Abstract

We present a general method for converting any family of unsatisfiable CNF formulas that is
hard for one of the simplest proof systems, tree resolution (ordinary backtracking search), into
formulas that require large rank in very strong proof systems, which include any proof system
that manipulates polynomials or polynomial threshold functions of degree at most k (known as
Th(k) proofs). These include high degree versions of Lovász-Schrijver systems (LS(k), LS+(k)),
high degree versions of Cutting Planes proofs (CP(k)), Sherali-Adams, and Lasserre proofs.

We introduce two very general families of these very strong proof systems, denoted by $T^{cc}(k)$
and $R^{cc}(k)$. The proof lines of $T^{cc}(k)$ are arbitrary Boolean functions, each of which can be
evaluated by an efficient k-party randomized communication protocol. $T^{cc}(k)$ proofs include
Th(k - 1) proofs as a special case. $R^{cc}(k)$ proofs generalize $T^{cc}(k)$ proofs and require only
that each inference be checkable (in a certain weak sense) by an efficient k-party randomized
communication protocol.

Our main results are the following: 

• For every k that is O(log log n), any unsatisfiable CNF formula F requiring resolution rank r
can be converted to a related CNF formula $G = \mathrm{Lift}_k(F)$ requiring refutation rank
$r^{\Omega(1/k)} / \log^{O(1)} n$ in all $R^{cc}(k)$ systems. Since resolution rank is at least resolution width
(for which many strong lower bounds are known), this yields strong rank lower bounds in
all of the above proof systems for large classes of natural CNF formulas.

• There are strict hierarchies for $T^{cc}(k)$ and $R^{cc}(k)$ systems with respect to k. Specifically,
for any k that is O(log log n), there are unsatisfiable CNF formulas whose proofs require
large rank in $R^{cc}(k)$ but which can be refuted via polylogarithmic rank CP(k) proofs. Rank
separations between CP(k - 1) and CP(k), between Th(k - 1) and Th(k), and between
$R^{cc}(k)$ and $T^{cc}(k + 1)$ follow immediately.

• When k is O(log log n) we also derive $2^{n^{\Omega(1/k)}}$ lower bounds on the size of tree-like $T^{cc}(k)$
refutations for large classes of lifted CNF formulas. Moreover, the rank hierarchies extend
to nearly exponential separations in tree-like proof size.

• A general method for producing integrality gaps for low rank $R^{cc}(2)$ inference based on
related gaps for low rank resolution. This yields integrality gaps for low rank Cutting
Planes and more general Th(1) inference. These gaps are optimal for MAX-2t-SAT.

1 Introduction



Over the last decade or so there have been a large number of results proving lower bounds on the
rank required to refute (or approximately optimize over) systems of constraints in a wide variety of
semi-algebraic (a.k.a. polynomial threshold) proof systems. These include systems such as
Lovász-Schrijver [32], Cutting Planes [19, 12], Positivstellensatz [22], Sherali-Adams [33], and Lasserre [30]
proofs.

Highlights of this work include recent linear rank lower bounds for Lasserre proofs [40, 38]
for many constraint optimization problems as well as rank lower bounds for semi-algebraic proof
systems of varying degrees for other important optimization problems [10] [38] [26] [18] [32] [31] [17].
In addition to these rank lower bounds a few other papers also have produced superpolynomial 
lower bounds on the size of tree-like proofs (in which the pattern of inferences forms a tree) in 
specific semi-algebraic proof systems either directly |2H [20] [27] or as a consequence of the rank 
lower bounds [33]. 

Exciting and important as these results are, their proofs rely on delicate constructions of 
problem-specific local distributions on inputs that satisfy constraints based on the specific rules 
for each proof system. Furthermore, because there is not much in the way of effective reductions 
for such proof systems, lower bounds for one problem usually do not translate to other problems. 

A very different approach for proving lower bounds for semi-algebraic proofs was developed
in [2], whereby the problems of lower-bounding the rank or tree-like proof size are reduced to a
lower bound problem in communication complexity. This allows the results to be applicable to a
much wider class of proof systems, called Th(k) proofs, which generalizes the semi-algebraic proof
systems discussed above. In these systems, a proof consists of a sequence of lines, each of which
is a multivariate polynomial inequality of degree at most k; the only requirement is that each line
either expresses an input constraint or is a semantic consequence of a constant number (say two
or three) of its predecessors. [2] showed that if an unsatisfiable CNF formula G has a small-rank
(or small tree-like size) Th(k - 1) refutation then, over every partition of the variables of G for k
parties, there is an efficient k-party randomized NOF protocol that outputs a falsified clause in G.
Thus to lower bound the rank of Th(k - 1) proofs it suffices to find an unsatisfiable family of CNF
formulas for which the k-player NOF communication complexity of this underlying search problem
(outputting a falsified clause) is large.

However, though this communication complexity approach was de-coupled from the specifics of 
the proof system, like the other lower bounds on semi-algebraic proofs, the reduction given in [2] was
very problem-specific and delicate. One source of the difficulty was that the clause search problem 
needs to be hard for randomized protocols to solve but is always easy nondeterministically, as the 
players can easily guess and verify a violated clause. Much of the delicacy of the argument was in 
carefully embedding a specific candidate function (set disjointness), which appeared to have these 
characteristics, into the search problem of an unsatisfiable CNF. 

Indeed, using a long and involved argument, [2] showed the feasibility of this communication
complexity approach by constructing a particular family of CNF formulas, (k - 1)-fold Tseitin
tautologies over Θ(log n)-degree LPS expander graphs, such that lower bounds on the k-party
randomized NOF communication complexity of the k-party set disjointness function yield rank and
tree-like size lower bounds for Th(k - 1) refutations. The recent lower bounds of Lee and
Shraibman [31] and Chattopadhyay and Ada [11] for the k-party randomized communication complexity
of set disjointness thus yield unconditional rank bounds for Th(k) proofs. Unfortunately, though
the set disjointness bounds apply for k up to (1 - o(1)) log log n, the details of the reduction in [2],
which was claimed for each constant k, only apply for k = O(log log log n). Moreover, the method
only applies to this one particular family of unsatisfiable formulas, and no other lower bounds for
Th(k) proofs have been known by any other method.

In this paper for the first time we provide a simple and general method that produces
unsatisfiable formulas requiring proofs of large rank and tree-like size in semi-algebraic proof systems.
This applies to a broad range of systems including all of Th(k) for k up to (1 - o(1)) log log n. Our
method allows one to take any unsatisfiable formula requiring large rank in a very simple proof 
system, resolution, and derive new formulas that require large rank and tree-like proof size in these 
very powerful semi-algebraic systems. In particular, this construction applies to all formulas of 
large resolution width [5] since resolution width is a lower bound on resolution rank. A simplified 
statement of our main result is the following. 

Theorem 1.1. Let F be any family of 3-CNF formulas in m variables with resolution rank r. Then
for any $\epsilon > 0$ and integer $k \ge 1$ there is a family of CNF formulas $G = \mathrm{Lift}_k(F)$ of size $n = m^{O(1)}$
such that if $k < (1 - \epsilon) \log\log n$ then G requires Th(k + 1) refutations of rank $r^{\Omega(1/k)} / \log^{O(1)} n$ and
tree-size $\exp(r^{\Omega(1/k)})$. In particular, if r is $m^{\Omega(1)}$ then G requires Th(k + 1) rank $n^{\Omega(1/k)}$ and tree-size
$\exp(n^{\Omega(1/k)})$.

Our lower bounds are much more general than this statement. In particular, our proof shows
that the lifted formula requires large rank in any proof system in which the truth of each line in a
proof can be verified by an efficient k-party randomized communication protocol; the above theorem
follows by the reduction in [2]. Our lower bounds also apply to proof systems in which individual
proof lines may not be efficiently verifiable but in which any falsifying assignment at an inferred
line can be traced to one of its antecedents using an efficient k-party randomized communication
protocol.

Our method is an example of a kind of hardness amplification that we term a "hardness escalation"
method, whereby one takes an object, in this case an unsatisfiable 3-CNF formula, that is
hard for a weak complexity measure and produces another object, the lifted formula in this case, 
that is hard for a much stronger complexity measure. This is related to but different from typical 
hardness amplification methods where one is concerned with producing new problems for which 
similar classes of algorithms have much lower probability of success. 

Our proof uses intuition and ideas from the pattern matrix method developed by Sherstov [46,
47] and from a related method developed earlier by Raz and McKenzie [39]. Both of these are
hardness escalation methods for communication complexity. Each method begins with a
computational problem that is hard for a weak complexity measure, either a relation R of large decision
tree complexity ([39]), or a function f of large approximate polynomial degree ([46]), and extends
the problem using a "pattern matrix" to produce a problem of large deterministic ([39]), or large
randomized and quantum ([46]), two-party communication complexity.

We use the k-party generalizations of the pattern matrix method developed in [31] [11] [14] [3].
Starting with a function f on m variables and a parameter k, these generalizations lift f to obtain
another function $g = \mathrm{Lift}_k(f)$ on mk bit-strings. The transformation takes
each original variable of f and replaces it by a Boolean selector function on k bit-strings. As long as
f is hard in the weak measure, g is hard in the k-player number-on-forehead (NOF) randomized
communication complexity model (for a particular partition of the new variables).

A key obstacle when trying to apply the pattern matrix method to the proof complexity setting
is that the approach only works with Boolean functions, and not with unsatisfiable CNF formulas.
To overcome this obstacle, we associate a family of Boolean functions $Z_F$ with every unsatisfiable
F and show that if the hardness assumption on F is satisfied then there is some function $f \in Z_F$
that has large decision tree complexity. Furthermore, if there is an efficient communication protocol
that outputs falsified clauses in Lift(F), then there is an efficient protocol for Lift(f) for any $f \in Z_F$.
In this way we are able to combine the hardness escalation ideas of [39, 46] to obtain our results.

We can also prove a converse to our result, thus completely characterizing the Th(k) rank of
our lifted formulas. That is, in addition to deriving lower bounds on the rank of proofs for $\mathrm{Lift}_k(F)$
in terms of the resolution rank r of F, we also show that the rank of $T^{cc}(2)$ proofs (and even
resolution proofs) of $\mathrm{Lift}_k(F)$ is not much larger than r.

Using the above lower bounds, we are able to prove new rank separation theorems for hierarchies
of polynomial threshold proof systems. By considering $\mathrm{Lift}_k(F)$ for certain unsatisfiable CNF
formulas F that require large rank resolution refutations but need only small rank Cutting Planes
refutations, we obtain strong rank separations between the power of $T^{cc}(k + 1)$ and $R^{cc}(k)$, between
Th(k) and Th(k - 1), and between CP(k) and CP(k - 1) refutations, where CP(k) is the natural
generalization of Cutting Planes to degree k.

Finally, using Sherstov's strengthened degree-discrepancy lemma for 2-player communication
complexity [46], we apply our techniques to prove optimal integrality gaps for a large family of
optimization problems even after rounds of CP or Th(1).

Related Work on Hardness Escalation As mentioned above, the usual form of hardness 
amplification in circuit complexity is a method of amplifying the probability of error. That is, 
the complexity class C is fixed (or nearly fixed), and the goal is to go from a function that is 
weakly hard (e.g., any circuit in C that approximates f has non-negligible probability of error) to a
function that is much harder (e.g., any circuit in a slightly smaller class than C that approximates
f has error exponentially close to 1/2).

A different type of hardness amplification that seems to be more relevant to proof complexity is
what we call hardness escalation. Here, we start with a function f that is hard for some complexity
class (where hard can be either worst case or average case), and we construct a g that is hard
for a larger complexity class. Hardness escalation results have been obtained in models such as
communication complexity [46], sub-exponential time complexity [23], and circuit depth ([25]
[16] [39]). A similar concept called hardness condensing was introduced in [9] and some interesting
results were proven for complexity classes beyond NP (with advice).

Hardness escalation for proof complexity means starting with an unsatisfiable family of formulas 
that is hard for some class of proof systems, and constructing a related family that is hard for a 
stronger class of proof systems. There have been a few papers in the proof complexity literature 
that have implicitly used this idea. It has been observed that if some formula F requires large 
resolution width, then the xorification of F, obtained by replacing each variable by an xor of 
several variables (and then rewriting as a CNF), is hard with respect to resolution size. This idea 
has been used in many papers to obtain separations between various refinements of resolution, 
with respect to both size and space (e.g., [l9l[8l[l|). More generally, [33ll35j showed how to replace 
variables in a somewhat hard unsatisfiable formula by hard functions in order to prove hardness 
escalation theorems for tree-like proof systems, with the caveat that the allowable cuts in the proof 
are restricted to belong to a weaker class than the hard functions. In particular, this approach fails 
to give lower bounds for CNF formulas. However, for certain special families of formulas, one can 
do better. Schoenebeck [40] has shown how to obtain rank lower bounds for Lasserre proofs based
on resolution rank lower bounds for particular families of formulas.
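
To make the xorification step mentioned above concrete, the following is a minimal sketch, not taken from the cited papers. It assumes clauses are lists of nonzero integers (DIMACS style, with distinct variables per clause) and uses a hypothetical numbering in which variable v is replaced by the fresh variables (v-1)*arity+1, ..., v*arity. Each assignment to the fresh variables that falsifies the substituted clause is ruled out by one new clause.

```python
from itertools import product

def xorify_clause(clause, arity=2):
    """Replace each variable by the XOR of `arity` fresh variables and expand to CNF.

    The fresh-variable numbering is a hypothetical convention for this sketch.
    """
    blocks = [[(abs(lit) - 1) * arity + j + 1 for j in range(arity)] for lit in clause]
    new_vars = [v for block in blocks for v in block]
    out = []
    for bits in product([False, True], repeat=len(new_vars)):
        value = dict(zip(new_vars, bits))
        satisfied = False
        for lit, block in zip(clause, blocks):
            parity = sum(value[v] for v in block) % 2 == 1
            if parity == (lit > 0):   # the substituted literal is true
                satisfied = True
                break
        if not satisfied:
            # Forbid exactly this falsifying assignment.
            out.append(frozenset(-v if value[v] else v for v in new_vars))
    return out
```

A clause of width w becomes 2^{(arity-1)w} clauses of width arity*w, which is why xorification is applied to formulas of small width.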

Outline of the Paper Section 2 contains definitions and preliminary results that we will need. 
In Section 3, we prove our main result, showing how to start with an unsatisfiable CNF formula 
requiring large Resolution rank, and lift it to obtain another formula that is hard with respect 
to stronger proof systems. We present two methods of lifting f, one based on the tensor selector
function and the second one based on the parity selector function. In Section 4 we apply our main 
theorems in order to prove hierarchy theorems for several proof systems. We conclude in Section 5 
with a discussion and open problems. 

2 Preliminaries 

2.1 Functions, Search Problems and Decision Trees 

For a CNF formula F, let clauses(F) denote the set of clauses of F and let |F| denote |clauses(F)|. 
F is a t-CNF formula iff every clause contains at most t literals. 

Functional Composition For any function f on m bits and any function h on s bits, we
abuse notation slightly and use $f \circ h$ to denote the following function on ms bits:
$(f \circ h)(x_1, \ldots, x_m) := f(h(x_1), \ldots, h(x_m))$, where each $x_i \in \{0,1\}^s$.
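
As a small illustration, assuming the usual block-wise reading of this composition (h is applied to each of m consecutive blocks of s bits, and f to the m results), a sketch in Python could look as follows. The helper name `compose` is hypothetical.

```python
def compose(f, h, m, s):
    """Return the function f∘h on m*s bits under the block-wise reading."""
    def g(bits):
        if len(bits) != m * s:
            raise ValueError("expected m*s input bits")
        # Apply h to each block of s bits, then f to the m outputs.
        return f([h(bits[i * s:(i + 1) * s]) for i in range(m)])
    return g
```

For example, `compose(any, lambda b: sum(b) % 2 == 1, 2, 2)` is the OR of two 2-bit parities.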

Decision Problems and Search Problems A Boolean decision problem over n variables is a
function from $\{0,1\}^n$ to $\{0,1\}$. Let F be an unsatisfiable CNF formula over variables $x_1, \ldots, x_n$
consisting of m clauses. The canonical Boolean relation associated with F is the predicate $R_F(x, y)$,
where x is a vector of length n, and y is a number, $1 \le y \le m$. $R_F(\alpha, \beta)$ is true if and only if
$\alpha$ is a Boolean assignment and the clause $C_\beta$ in F is falsified by $\alpha$. Associated with a Boolean
relation R(x, y) is a search problem: given x, output a y such that R(x, y) is satisfied. Given a
Boolean relation R(x, y), we call a function g a subfunction of R if R(x, g(x)) is satisfied for every
x. In other words, g is a particular function that solves the search problem associated with R. For
example, for the canonical Boolean relation $R_F$ associated with an unsatisfiable CNF formula F,
the search problem $F_{search}$ is the problem of finding a violated clause given a Boolean assignment
to the underlying variables of F. A function g is a subfunction of $R_F$ if for any truth assignment
$\alpha$, $g(\alpha)$ returns a clause of F that is falsified by $\alpha$.
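
The clause search problem can be illustrated with a short sketch. The representation is an assumption made here for illustration only: a CNF is a list of clauses, each a list of nonzero integers (literal v means $x_v$, -v means its negation), and an assignment is a dict from variable index to a Boolean. The function `search` is one particular subfunction of $R_F$, namely the one returning the lowest-index falsified clause.

```python
def falsified_clauses(cnf, assignment):
    """Return all indices y (1-based) with R_F(assignment, y) true."""
    def lit_true(lit):
        return assignment[abs(lit)] == (lit > 0)
    return [y for y, clause in enumerate(cnf, start=1)
            if not any(lit_true(lit) for lit in clause)]

def search(cnf, assignment):
    """One subfunction of R_F: the lowest-index falsified clause.

    For an unsatisfiable cnf the list below is never empty.
    """
    return falsified_clauses(cnf, assignment)[0]
```

Note that the relation may admit several answers on the same input, while any subfunction commits to exactly one of them.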

Definition A decision tree on Boolean variables $x_1, \ldots, x_n$ is a binary tree in which every non-leaf
node is labeled with some $x_j$ and has two outgoing edges that are labeled with 0 and 1, and every
leaf node v is labeled by some value $\ell(v)$. Thus any path from the root to a leaf identifies a partial
assignment to these variables. A decision tree T is said to compute a function f if for every leaf
v in T, its associated partial assignment determines an output for f that is equal to $\ell(v)$. That
is, if $\sigma$ is a partial assignment associated with a path of T whose leaf is labelled by $\ell(v)$, then for
every total assignment $\sigma'$ extending $\sigma$, $f(\sigma') = \ell(v)$. For a relation R, a decision tree T solves the search
problem associated with R if it computes some subfunction of R. The height of a decision tree is
the maximum length of any path from the root to a leaf. The decision tree complexity of f, denoted
D(f), is the minimum height of all such trees computing f.
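
For very small n, D(f) can be computed directly from this definition. The following brute-force sketch (exponential time, for illustration only) assumes f takes a tuple of n bits. It treats a partial assignment as a leaf exactly when f is constant on all of its extensions, and otherwise tries every unqueried variable as the next query.

```python
from itertools import product

def decision_tree_complexity(f, n):
    """Minimum height of a decision tree computing f on n Boolean variables."""
    def depth(partial):
        free = [i for i in range(n) if i not in partial]
        values = set()
        for bits in product([0, 1], repeat=len(free)):
            total = dict(partial)
            total.update(zip(free, bits))
            values.add(f(tuple(total[i] for i in range(n))))
        if len(values) == 1:
            return 0  # f is determined: this node can be a leaf
        # Query the variable whose worse branch is cheapest.
        return 1 + min(max(depth({**partial, i: b}) for b in (0, 1)) for i in free)
    return depth({})
```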
2.2 Hard functions from hard unsatisfiable CNF formulas

Given an unsatisfiable CNF formula F, we will say that F is somewhat hard if the decision tree
complexity of $F_{search}$, $D(F_{search})$, is large (superpolylogarithmic in the size of F). It is well known
that the decision tree complexity of $F_{search}$ is equivalent to the height of any tree-like resolution
refutation of F, or equivalently to the depth of recursion of any DPLL search procedure for F [15].

Now given an unsatisfiable CNF formula F that is somewhat hard, we want to identify a set
Z of Boolean functions associated with F that witnesses the hardness of F. Specifically, we want
Z to have the property that if $D(F_{search})$ is large then Z contains a function with large decision
tree complexity. This alone would be easy. However, we also want Z to be constructible from
algorithms computing $F_{search}$.

A natural choice for the collection of functions from $F_{search}$ would be to define $f_S(\alpha) = 1$ for
some $S \subseteq \mathrm{clauses}(F)$ if and only if there is some clause in S that is falsified by $\alpha$. One might hope
to argue that one such $f_S$ would have decision tree complexity close to that of $F_{search}$. The obvious
way to try to show this would be to reason by reduction; however, it is not clear how to construct
a decision tree for $F_{search}$ from decision trees for such a collection of $f_S$ since both $f_S(\alpha)$ and $f_{\bar{S}}(\alpha)$
may equal 1. Some sort of symmetry-breaking scheme is required and this scheme must satisfy the
property that for $S \subseteq T$ we have $f_T(\alpha) = 1$ whenever $f_S(\alpha) = 1$.

Definition A set Z of Boolean functions over the set of variables of an unsatisfiable CNF formula
F is said to be a consistent system of functions for F if $Z = \{f_S \mid S \subseteq \mathrm{clauses}(F)\}$ and for any
input assignment $\alpha$ there exists a clause C in F falsified by $\alpha$ such that for any $f_S \in Z$ we have
that $f_S(\alpha) = 1$ if and only if $C \in S$.

Proposition 2.1. Given an unsatisfiable CNF formula F, any function $f^*$ that is a subfunction
of $R_F$ (that is, it solves the search problem $F_{search}$) yields a consistent system $Z_{f^*}$ of functions for
F.

Proof. Use the clause $C = f^*(\alpha)$ and define $f_S(\alpha) = 1$ iff $C \in S$. □
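
The construction in this proof is short enough to sketch directly. The toy formula `F` and the particular choice of `f_star` below are hypothetical examples, not part of the paper. The point is that every $f_S$ is defined through the single clause chosen by $f^*$, even when the assignment falsifies several clauses.

```python
# Hypothetical toy formula: (x1) AND (NOT x1 OR x2) AND (NOT x2), clauses indexed 1..3.
F = [[1], [-1, 2], [-2]]

def f_star(alpha):
    """A subfunction of R_F: index of the first clause of F falsified by alpha."""
    for y, clause in enumerate(F, start=1):
        if not any(alpha[abs(lit)] == (lit > 0) for lit in clause):
            return y
    raise ValueError("F is unsatisfiable, so some clause is always falsified")

def f_S(S, alpha):
    """Member f_S of the system Z_{f*}: 1 iff the clause chosen by f* lies in S."""
    return 1 if f_star(alpha) in S else 0
```

With x1 false and x2 true, clauses 1 and 3 are both falsified, but `f_star` commits to clause 1, so `f_S({3}, alpha)` is 0. This is the symmetry breaking that the naive definition lacks.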

The following proposition says that any consistent system of functions for F witnesses the 
hardness of F. 

Proposition 2.2. For any unsatisfiable CNF formula F and any consistent system Z of functions
for F, there exists a function $f_S \in Z$ such that $D(F_{search}) \le D(f_S) \cdot \lceil \log_2 |F| \rceil$.

Proof. Build a decision tree for $F_{search}$ using binary search by querying the $f_S$ for subsets $S \subseteq
\mathrm{clauses}(F)$ to narrow down the search. The requirement of consistency ensures that the path
followed by binary search on input $\alpha$ yields the falsified clause C. To derive the tree for $F_{search}$
replace each query of $f_S$ by the optimal decision tree for $f_S$, yielding the claimed bound. □
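
The binary search in this proof can be sketched as follows. The interface is an assumption for illustration: `query(S)` stands for evaluating $f_S$ on the fixed input, with S a set of clause indices in 1..num_clauses. Consistency guarantees that the answers all refer to one designated clause, so halving the candidate interval is sound.

```python
def find_clause_by_binary_search(num_clauses, query):
    """Locate the designated falsified clause with at most ceil(log2 |F|) queries.

    Returns the clause index and the number of f_S queries made.
    """
    lo, hi = 1, num_clauses   # invariant: the designated clause index lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if query(frozenset(range(lo, mid + 1))):
            hi = mid
        else:
            lo = mid + 1
    return lo, queries
```

Replacing each query by a decision tree for the corresponding $f_S$ gives a decision tree for the search problem whose height is the number of queries times the largest $D(f_S)$ used.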

2.3 Communication Complexity 

Given a function (or relation) f, some number $k \ge 2$ of players, and a partition of the input of f
for these players, communication complexity is concerned with the least amount of communication
necessary between the players in order for them to compute an output of f. In the
number-on-the-forehead (NOF) communication model, each player sees all inputs except the block of the partition
that is assigned to him. For formal definitions, the reader is referred to [29]. In this paper, we will
be concerned only with NOF randomized communication complexity.

Definition Let $|\mathcal{P}|$ denote the number of bits communicated in a communication protocol $\mathcal{P}$ and
$\mathcal{P}(x)$ the output of the protocol on input x. A randomized protocol $\mathcal{P}$ is said to compute a function
f with error at most $\epsilon$ if on any input x, with probability at least $1 - \epsilon$ (over the choices of the players'
random coins c), $\mathcal{P}(x, c) = f(x)$.

If f is a search problem, the standard definition (e.g. [29]) of randomized communication
complexity states that $\mathcal{P}$ computes f with error at most $\epsilon$ if and only if for at least a $1 - \epsilon$ fraction of
choices of random coins c, on any input x, $\mathcal{P}(x, c) \in f(x)$. Even for the $1 - \epsilon$ good choices of c,
the values $\mathcal{P}(x, c)$ for different choices of c may be different elements of f(x). However, for our
construction of hard functions from hard unsatisfiable CNF formulas, we will require a stronger
notion.

Definition A randomized protocol $\mathcal{P}$ is said to consistently compute a relation f with error at
most $\epsilon$ if there is a function $f^*$ contained in f, that is, $f^*(x) \in f(x)$ for every x, such that $\mathcal{P}$
computes $f^*$ with error at most $\epsilon$.

2.4 Proof Systems and the Complexity of Clause Search 

A proof system for a language £ is a polynomial time algorithm V such that for all F, F £ C \i 
and only if there exists a string P, referred to as a proof, such that V accepts {F,P). If C is the 
set of all unsatisfiable CNFs, or all unsatisfiable sets of inequalities, and F €z L, then P is called a 
refutation of F. 

A wide variety of proof systems exist in the literature. In most of these proof systems, a proof
or refutation can be expressed as a sequence of lines, each of which is either (a translation of) an
input clause or follows from some previous lines via some inference rule. (Inference rules that do not
depend on previous lines are called axioms.) We call such proofs standard proofs. In the case that $\mathcal{L}$
is the set of unsatisfiable CNFs or propositional tautologies, in a standard proof each line represents
a Boolean function on the variables of the formula, and any inference of a line representing function
g from lines representing functions $f_1, \ldots, f_s$ must be sound, in that for any Boolean assignment to
their input variables g must evaluate to true whenever all of $f_1, \ldots, f_s$ do. We call the maximum
s over all derivations in a proof its fan-in. A refutation of an unsatisfiable formula F in a standard
proof system is a sequence of formulas, where the initial formula F is included as an axiom (or set
of axioms), and the final formula in the sequence is the trivially false formula.

Definition We associate a DAG $\mathcal{G} = (V, E)$ with every standard proof P, where V is the set of
lines in P and $(u, v) \in E$ if line v is derived via some inference rule using line u. The size of P is
the number of bits in P, which is lower-bounded by the number of lines in P. The rank of P is
the length of the longest path in $\mathcal{G}$. We consider $\mathcal{G}$ to be a tree if every internal node has fanout
one. (The axioms, which are not internal nodes, can be repeated.) If $\mathcal{G}$ is a tree, we say that
P is tree-like. The size complexity and rank complexity of F in a standard proof system are the
minimum size and minimum rank, respectively, of all proofs for F in that system. Similarly, we
define tree-like size complexity as the minimum size over all proofs that are restricted to be tree-like.

Note that restricting a proof to be tree-like does not increase the rank of a proof because the 
same line can be re-derived multiple times without affecting the rank. Tree-like size, however, can 
be much larger than general size. 
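
The rank of a proof is simply the longest-path length of its DAG, which the following sketch computes. The input format is a hypothetical convention chosen for this example: a dict mapping each line to the tuple of lines it was derived from, with input lines mapping to the empty tuple.

```python
from functools import lru_cache

def proof_rank(premises):
    """Length of the longest path in the proof DAG described by `premises`."""
    @lru_cache(maxsize=None)
    def depth(line):
        preds = premises[line]
        # Input lines have depth 0; a derived line is one step above its deepest premise.
        return 0 if not preds else 1 + max(depth(u) for u in preds)
    return max(depth(line) for line in premises)
```

Re-deriving a line in several places, as a tree-like proof must, repeats nodes but never lengthens any path, which matches the remark that tree-likeness does not increase rank.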
We first mention some of the most well-studied proof systems. In each of these systems, there is a
set of derivation rules (which can be thought of as inference schemas) of the form $F_1, F_2, \ldots, F_t \vdash G$
and each derivation step in a proof must be an instantiation of one of these rules.

A basic system is resolution, which manipulates clauses. Its only rule is the resolution rule: the
clause $(A \vee B)$ is derived from $(A \vee x)$ and $(B \vee \neg x)$, where A and B are arbitrary disjunctions of
literals and x is a variable. A resolution refutation of an unsatisfiable CNF formula F is a sequence
of clauses, ending with the empty clause, such that each clause in the sequence is either a clause of
F, or follows from two previously derived clauses via the resolution rule. The well-known connection
showing that DPLL executions and tree-like resolution proofs are equivalent gives us the following
proposition.

Proposition 2.3. For any CNF formula F, the minimum rank of any resolution proof of F is
equal to $D(F_{search})$.
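
The resolution rule described above can be sketched in a few lines, assuming clauses are sets of nonzero integers with -v denoting the negation of variable v (a representation chosen here for illustration).

```python
def resolve(c1, c2, x):
    """Resolve clause c1 (containing x) with clause c2 (containing -x) on variable x."""
    if x not in c1 or -x not in c2:
        raise ValueError("c1 must contain x and c2 must contain -x")
    # (A or x), (B or not x)  |-  (A or B)
    return (frozenset(c1) - {x}) | (frozenset(c2) - {-x})
```

For instance, resolving {1} with {-1, 2} on variable 1 gives {2}, and resolving {2} with {-2} gives the empty clause, so the formula with clauses {1}, {-1, 2}, {-2} has a refutation of rank 2.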



Another proof system is the Cutting Planes (CP) proof system, which manipulates integer linear
inequalities. The two rules in the CP system are:

$$p_1 \ge 0, \ldots, p_t \ge 0 \;\vdash\; \sum_{i=1}^{t} \lambda_i p_i \ge 0, \qquad (1)$$

and

$$\sum_i c\,a_i x_i \ge b \;\vdash\; \sum_i a_i x_i \ge \lceil b/c \rceil, \qquad (2)$$

where each $p_i$ is a linear form, $a_i$, b, c, and $\lambda_i \ge 0$ are integers, and $x_j$ is a variable. In CP, we also
have axioms $0 \le x_j$ and $x_j \le 1$, and each input clause $(\ell_1 \vee \cdots \vee \ell_j)$ is translated as $\ell'_1 + \cdots + \ell'_j \ge 1$
where $\ell' = x$ if $\ell = x$ and $\ell' = (1 - x)$ if $\ell = \neg x$. The trivial unsatisfiable formula is $0 \ge 1$. A
CP refutation is a sequence of inequalities, ending with $0 \ge 1$, such that all other inequalities are
either axioms, translated input clauses, or follow from two previously derived inequalities via a CP
rule.

We will consider a natural extension of CP, denoted CP(k), in which the above CP proof rules
may also be applied when the $p_i$ are allowed to be degree k multivariate polynomials and the $x_j$
are replaced by degree k monomials. Since the input clauses are linear there are two other rules
that allow the creation of higher degree inequalities, namely:

$$p \ge 0 \;\vdash\; x_i p \ge 0$$

and

$$p \ge 0 \;\vdash\; p \ge x_i p$$

for all polynomials p of degree at most k - 1 and variables $x_i$.
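
The two linear CP rules can be illustrated with a small sketch. It assumes the standard reading of the rules as nonnegative integer combination and division with rounding, and it uses a hypothetical representation in which an inequality is a pair (coeffs, b) meaning that the sum of coeffs[v] * x_v is at least b.

```python
def combine(inequalities, multipliers):
    """Rule (1): nonnegative integer combination of inequalities."""
    coeffs, rhs = {}, 0
    for (a, b), lam in zip(inequalities, multipliers):
        if lam < 0:
            raise ValueError("multipliers must be nonnegative")
        for v, av in a.items():
            coeffs[v] = coeffs.get(v, 0) + lam * av
        rhs += lam * b
    return coeffs, rhs

def divide(coeffs, b, c):
    """Rule (2): from sum(c*a_i*x_i) >= b derive sum(a_i*x_i) >= ceil(b/c)."""
    if c <= 0 or any(a % c for a in coeffs.values()):
        raise ValueError("c must be positive and divide every coefficient")
    return {v: a // c for v, a in coeffs.items()}, -(-b // c)  # integer ceiling
```

The rounding in rule (2) is what gives CP its strength: from 2x + 2y >= 1 one derives x + y >= 1, which is sound over the integers but not over the reals.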

Other important well-studied proof systems are the Lovász-Schrijver proof systems ($LS_0$, LS,
$LS_+$, and $LS_{+,*}$) which manipulate polynomial inequalities of degree at most 2. These proofs use
various subsets of the inference rules

$$\ell_{11} \ge 0, \ell_{12} \ge 0, \ldots, \ell_{t1} \ge 0, \ell_{t2} \ge 0 \;\vdash\; \sum_{i=1}^{n} \alpha_i (x_i^2 - x_i) + \sum_{j=1}^{t} \lambda_j \ell_{j1} \ell_{j2} + \sum_{j=t+1}^{s} \ell_j^2 \ge 0$$

where $\ell_j$, $\ell_{jb}$ are linear forms and $x_i$ are variables, and $\alpha_i$ and $\lambda_j \ge 0$ are integers; the axioms and
translations of clauses are the same as for CP. Thus, all Lovász-Schrijver proof lines are degree 2
polynomial inequalities. One can generalize these proof rules to $LS_+^k$ proofs in which one is allowed
to multiply arbitrary inequalities, use $x_i^2 = x_i$ and add the squares of higher degree terms, provided
that each quantity in the inference is syntactically a polynomial of degree at most k.

Each of the above proof systems has a specific set of inference rule schemas, which allows them 
to have polynomial-time verifiers. We also consider more powerful semantic proof systems which 
restrict the form of the lines and the fan-in of the inferences but dispense with the requirement of a 
polynomial-time verifier and allow any semantically sound inference rule with a given fan-in. (Each 
line is a clause or follows via some semantic inference rule.) The fan-in must be restricted because 
the semantic rules are so strong. The following strong semantic proof system was introduced in [2] . 

Definition For integer $k \ge 1$, we denote by Th(k) the semantic proof system whose proofs have
fan-in 2, each line is a polynomial inequality of degree at most k, and input clauses and axioms are
represented as linear inequalities as in the definition of CP above.

Without loss of generality via Carathéodory's Theorem, for formulas in n variables, in the case
of CP the fan-in of inferences is at most n and in the cases of $LS_0$, LS, and $LS_+$, the fan-in of
inferences is at most $(n + 1)^2$. From this we immediately obtain the following:

Proposition 2.4. (1) Any CP proof of size (tree-like size) S and rank r can be converted to a
Th(1) proof of size (tree-like size) O(S) and rank O(r log n). (2) Any $LS_0$, LS, or $LS_+$ proof of
size (tree-like size) S and rank r can be converted to a Th(2) proof of size (tree-like size) O(S) and
rank O(r log n).

Moreover, it is not hard to show that one can extend the above simulations by Th(k) proofs to
CP(k) and $LS_+^k$.

The Sherali-Adams and Lasserre proof systems introduce new variables for subsets of input 
variables of bounded size (which is called the rank of such proofs). Monomials of degree k represent 
the intended meaning of these added variables so Th(k) proofs of rank k also efficiently simulate 
rank k Sherali-Adams proofs and rank k/2 Lasserre proofs. 

In this paper we also consider semantic proof systems even more general than Th(k), namely
those for which the fan-in is bounded and the truth value of each line can be computed by a
multiparty communication protocol.

Definition For any $k, C \ge 1$, we denote by $T^{cc}(k, C)$ the semantic proof system of fan-in 2 in
which each proof line is a Boolean function whose value, for every partition of the input variables
into k groups¹, can be computed by a C-bit randomized k-party NOF communication protocol of
error at most 1/3. Both k and C may be integer functions of the input size of the formulas. In
keeping with the usual notions of what constitutes efficient communication protocols, we use $T^{cc}(k)$
to denote the union of all $T^{cc}(k, C)$ over all C in $\log^{O(1)} n$.

Note that via standard boosting, we can replace the error 1/3 in the above definition by $\epsilon$ at the
cost of increasing C by an $O(\log 1/\epsilon)$ factor. Therefore, without loss of generality, in the definition
of $T^{cc}(k)$ we can assume that the error is at most $1/n^{\log^{O(1)} n}$.

^1 We note that one could alternatively define $T^{cc}(k, C)$ systems based on a fixed partition of the inputs. While this
definition might yield a stronger proof system, it would complicate the notation without changing our results in any
significant way.
Note also that a semantic proof of rank $r$ that satisfies the same conditions as a $T^{cc}(k, C)$ proof
except that it has rules of fan-in at most $t > 2$ can be simulated by a $T^{cc}(k, 2Ct \log_2 t)$ proof of
rank $r \log_2 t$ by replacing each inference by a binary tree of height $\log_2 t$ in which lines of internal
nodes are conjunctions of their predecessors.

For polylogarithmic $k$, the following lemma shows that Th($k$) is a subclass of $T^{cc}(k + 1)$.

Lemma 2.5. For some constant $c > 0$, every Th($k$) refutation of a CNF formula on $n$ variables
is a $T^{cc}(k + 1, ck^3 \log^2 n)$ proof.

Proof. By the well-known result of Muroga [M], linear threshold functions on $n$ Boolean values only
require coefficients of $O(n \log n)$ bits. Since a degree $k$ threshold polynomial is a linear function
on at most $n^k$ monomials, it is equivalent to a degree $k$ threshold polynomial with coefficients of
$O(k n^k \log n)$ bits. As shown in [2], over any input partition there is a randomized $(k + 1)$-party
communication protocol of cost $O(k \log^2 s)$ and error $\le 1/s^{\Omega(1)}$ to verify a degree $k$ polynomial
inequality with $s$-bit coefficients. □

We also define another class of proofs based on $k$-party communication complexity that we will
see is even more general than $T^{cc}(k, C)$.

Definition For any integer functions $k, C \ge 1$, we denote by $R^{cc}(k, C)$ the semantic proof system
of arbitrary fan-in in which each proof line is a Boolean function such that the proof satisfies the
following property: for every partition of the input variables into $k$ groups, and every inference of
$B$ from $A_1, \ldots, A_s$ in the proof, there is a $C$-bit randomized $k$-party NOF communication protocol
of error at most 1/3 that computes a (partial) function $f_{A_1,\ldots,A_s \vdash B}$ from the inputs to the set $[s]$
such that on every input $\alpha$, if $B$ evaluates to false on input $\alpha$ then $A_{f_{A_1,\ldots,A_s \vdash B}(\alpha)}$ evaluates to false
on input $\alpha$.

We write $R^{cc}(k)$ to denote the union of all $R^{cc}(k, C)$ over all $C$ in $\log^{O(1)} n$.

The following is immediate: 
Lemma 2.6. Every $T^{cc}(k, C)$ proof is an $R^{cc}(k, C)$ proof.

Proof. The inferences in the $T^{cc}(k, C)$ proof are all of fan-in at most 2 and hence derive each line $B$ from
some lines $A_1$ and $A_2$. To compute the function $f_{A_1,A_2 \vdash B}$ the players evaluate $A_1$ on input $\alpha$ using
the protocol given by the $T^{cc}(k, C)$ proof. If that evaluates to false then they output 1; otherwise,
they output 2. □

We can sharpen this relationship further. The following is a standard method for strengthening 
a proof system S by adding resolution rules over the lines of S . 

Definition Given a proof system $S$, we define a related proof system $R(S)$ as follows: Lines of
$R(S)$ are unordered disjunctions of lines of $S$ and their negations. For every inference rule in $S$,
$A_1, \ldots, A_t \vdash B$, there is the corresponding rule $(G \lor A_1), \ldots, (G \lor A_t) \vdash (G \lor B)$ where $G$ is an
arbitrary disjunction of lines of $S$ and their negations. In addition there are extended resolution
rules that allow the introduction of new disjuncts, $G \vdash (G \lor A_1 \lor \ldots \lor A_t)$, or cuts on lines of $S$,
namely $(G \lor A), (H \lor \neg A) \vdash (G \lor H)$, where $A$ is a line of $S$ and $G$ and $H$ are arbitrary disjunctions
of lines of $S$ and their negations.

Lemma 2.7. Every $R(T^{cc}(k, C))$ proof is an $R^{cc}(k, C)$ proof.

Proof. For rules that correspond to rules of $T^{cc}(k, C)$ we apply the simple argument from Lemma 2.6
on the lines that are not common to all formulas. For the resolution rules, observe that the players
only need to evaluate the line $A$ to determine whether to select $(G \lor A)$ or $(H \lor \neg A)$. □

In particular, this shows via Lemma 2.5 that $R^{cc}(k + 1, ck^3 \log^2 n)$ proofs include the proof system
$R$(Th($k$)) (suggested by Hirsch). It is not clear whether one can efficiently simulate $R$(Th($k$)) using
$T^{cc}(k)$ proofs.

The following lemma, which is implicit in [2], gives the key relationships between $T^{cc}(k)$ and
$R^{cc}(k)$ proofs and randomized communication protocols that consistently compute $F_{search}$.

Lemma 2.8. Let $F$ be a CNF formula in $n$ variables and $\epsilon > 0$.

(i) If $F$ has an $R^{cc}(k, C)$ refutation of rank $r$ then, over every partition of the variables, there is
an $\epsilon$-error randomized $k$-party communication protocol $\mathcal{P}$ consistently computing $F_{search}$ such
that $|\mathcal{P}|$ is $O(Cr \log(r/\epsilon))$.

(ii) If $F$ has a tree-like $T^{cc}(k, C)$ refutation of size $S$ then, over every partition of the variables,
there is an $\epsilon$-error randomized $k$-party communication protocol $\mathcal{P}$ consistently computing
$F_{search}$ such that $|\mathcal{P}|$ is $O(C \log S \log(\log S/\epsilon))$.

Proof. First assume that we have a rank $r$ refutation in $R^{cc}(k, C)$. On input $\alpha$, the $k$ players
backtrack from the last derived inequality in the proof ($0 \ge 1$) to find some clause that is falsified
by $\alpha$. When they are at a line $B$ that follows from lines $A_1, \ldots, A_s$ in the proof, they run the
protocol for $f_{A_1,\ldots,A_s \vdash B}$, implied by the $R^{cc}(k, C)$ definition for the inference at $B$, $O(\log(r/\epsilon))$
times and take the majority answer to reduce its error below $\epsilon/r$. Then the players move to the
line indicated by that answer. The probability that this protocol makes an error is at most the sum
of all error probabilities on any path in the proof. Since the last line evaluates to false on input $\alpha$,
in the case that there is no error the players will return a fixed clause in the proof that is falsified
by $\alpha$, which implies that they consistently compute $F_{search}$.

For the second case of a size $S$ tree-like refutation, there is some line in the refutation that is
derived from between $S/3$ and $2S/3$ of the lines of the refutation tree. The players first evaluate
that line with error at most $\epsilon/(2 \log_2 S)$ by repeating the protocol $O(\log(\log S/\epsilon))$ times. If the line
evaluates to false then they continue within that subtree; otherwise, they remove the nodes of that
subtree. This is done recursively until a falsified clause is found. The depth of recursion is at most
$2 \log_2 S$. The rest is similar to the first case. □
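The backtracking in the first part of this proof can be illustrated with the communication layer stripped away. The following is a minimal sketch under simplifying assumptions: each proof line is given directly as a Python predicate, so every evaluation is exact and no majority vote is needed. The names `find_falsified_axiom`, `LINES`, and `PREMISES` are hypothetical and the tiny refutation is only an example.

```python
# Deterministic illustration of the walk used in Lemma 2.8(i).
# Assumption: lines are exact predicates, so no error reduction is modeled.

def find_falsified_axiom(lines, premises, last, assignment):
    """Walk from the final (always false) line back to a falsified axiom.

    lines[i]    : predicate giving the truth value of line i on an assignment
    premises[i] : indices of the lines that line i was inferred from ([] for axioms)
    Returns (index of a falsified axiom, number of inference steps walked).
    """
    current = last
    steps = 0
    while premises[current]:
        # A sound inference with a false conclusion has at least one false premise.
        current = next(p for p in premises[current] if not lines[p](assignment))
        steps += 1
    return current, steps

# Example: resolution refutation of (x), (not x or y), (not y); assignment = (x, y).
LINES = [
    lambda a: a[0],                  # 0: axiom x
    lambda a: (not a[0]) or a[1],    # 1: axiom (not x) or y
    lambda a: not a[1],              # 2: axiom not y
    lambda a: a[1],                  # 3: y, resolved from lines 0 and 1
    lambda a: False,                 # 4: empty clause, resolved from lines 3 and 2
]
PREMISES = {0: [], 1: [], 2: [], 3: [0, 1], 4: [3, 2]}
```

The number of steps never exceeds the rank of the refutation (here 2), which is where the factor $r$ in the communication bound comes from.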

3 Hardness Escalation for CNF formulas 

This section proves our results on hardness escalation. The high level idea is as follows. Recall that
an unsatisfiable $t$-CNF formula $F$ is somewhat hard if $F_{search}$ requires a large height decision tree.
Starting with a somewhat hard unsatisfiable $t$-CNF formula $F$ over variables $e_1, \ldots, e_m$, we build
a new CNF formula $G = Lift(F)$ of size nnP^^^ by lifting $F$ based on some function $\psi$ that encodes
$e_i$ using a larger collection of input bits. This lifting over CNF is adapted from previous work for
Boolean functions, which we review next.
3.1 Lifting Decision Tree Complexity to k-Party Communication Complexity

In this section, we show how to lift a Boolean function $f$ to obtain another function $g := f \circ \psi_k$
for some Boolean function $\psi_k$, where $\psi_k$ can be thought of as a simple encoding of a variable of $f$
using some number of new variables.

Definition Given $k, s > 0$ and a domain $A$, a function $\psi_k : \{0,1\}^s \times A^k \to \{0,1\}$ is called a
selector if there is some $h : A^k \to [s]$ such that $\psi_k(x, y_1, \ldots, y_k) = x_{h(y_1,\ldots,y_k)}$ for every $x \in \{0,1\}^s$
and $y_i \in A$. Informally, $\psi_k$ outputs a bit in $x$ that is selected by the values of $y_1, \ldots, y_k$.

There are two specific selector encodings $\psi_k$ that we are interested in: the tensor selector $\psi^T_{k,\ell}$
and the parity selector $\psi^\oplus_{k,a}$. In the tensor selector $\psi^T_{k,\ell}(x, y)$, we have $s = \ell^k$ and $A = [\ell]$, and we
think of $x \in \{0,1\}^s$ as indexed by $A^k$ and hence $h(\cdot)$ is just the identity function on $A^k$. In the
parity selector $\psi^\oplus_{k,a}(x, y)$, we have $s = 2^a$, $A = \{0,1\}^a$, and we think of $x$ as indexed by $a$-bit arrays
and $h(y_1, \ldots, y_k) = y_1 \oplus \cdots \oplus y_k$.
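The two selectors can be written out directly. The sketch below is illustrative only: it assumes $x$ is stored as a flat Python list, tensor coordinates are 0-based integers in `range(ell)`, and $a$-bit vectors are represented as integers in `range(2**a)`. The function names are hypothetical.

```python
from functools import reduce
from operator import xor

def tensor_selector(x, ys, ell):
    """Tensor selector: x has ell**k bits, ys is a k-tuple of coordinates in range(ell).

    The k coordinates are read as a mixed-radix index into the flattened tensor.
    """
    index = 0
    for y in ys:
        index = index * ell + y
    return x[index]

def parity_selector(x, ys):
    """Parity selector: x has 2**a bits, ys is a k-tuple of a-bit integers.

    The selected position is the bitwise XOR of the k inputs.
    """
    return x[reduce(xor, ys, 0)]
```

For example, with `ell = 3` and `ys = (1, 2)` the tensor selector reads position `1 * 3 + 2 = 5` of `x`, while the parity selector with `ys = (3, 2)` reads position `3 ^ 2 = 1`.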

Given our initial function $f$ over variables $x$, we define the $(k + 1)$-lifted version of $f$ to be
the function $f \circ \psi_k$.

It is not hard to see that if the decision tree complexity of $f$ is $d$, then for any $k \ge 2$, and over
any partition of the variables into $k$ groups, there is a $k$-party communication protocol computing
$g$ of cost approximately $d \cdot c$, where $c$ is the cost of computing $\psi_k$. The $k$ players just simulate
the decision tree for $f$ and the cost of computing any single variable in $f$ encoded by $\psi_k$ is $c$ bits.
If $\psi_k$ is simple enough, and therefore $c$ is negligible, then this cost is approximately equal to $d$.
Ideally, we would like to argue that this is the best that the players can do. Intuitively, since we
have encoded each input bit in $f$ indirectly, the players need to communicate $\Omega(1)$ bits in order to
be able to "learn" any single bit. If the decision tree complexity of $f$ is large, we would hope that
$g$ has large communication complexity. Recent results in communication complexity show that we
cannot do much better than the above trivial protocol, subject to some constraints on $\psi_k$.
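The trivial protocol can be sketched as a decision-tree walk in which each queried variable is decoded through the selector. This is an illustration under stated assumptions, not a communication protocol: the tree encoding (nested tuples `(var, if_zero, if_one)` with 0/1 leaves), the use of the parity selector, and the names `eval_lifted` and `AND_TREE` are all hypothetical. The counter tracks how many encoded variables are decoded, each of which would cost $c$ bits in the protocol.

```python
from functools import reduce
from operator import xor

def parity_select(x, ys):
    """Return the bit of x indexed by the XOR of the players' inputs ys."""
    return x[reduce(xor, ys, 0)]

def eval_lifted(tree, blocks):
    """Evaluate the lifted function by walking a decision tree for f.

    tree   : a leaf value (0 or 1) or a tuple (var, if_zero, if_one)
    blocks : blocks[var] = (x, ys), the encoding of variable var
    Returns (value, number of encoded variables that were decoded).
    """
    decoded = 0
    node = tree
    while isinstance(node, tuple):
        var, if_zero, if_one = node
        x, ys = blocks[var]
        bit = parity_select(x, ys)  # one selector evaluation per tree query
        decoded += 1
        node = if_one if bit else if_zero
    return node, decoded

# Depth-2 decision tree for f(e0, e1) = e0 AND e1.
AND_TREE = (0, 0, (1, 0, 1))
```

On any input the number of decoded variables is at most the depth $d$ of the tree, matching the cost estimate of roughly $d \cdot c$.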

We need the following approximation notion to bridge decision tree complexity and communi- 
cation complexity; this notion of approximating a real-valued function is polynomially related to 
decision tree complexity. 

Definition Given any $0 \le \epsilon < 1$, the $\epsilon$-degree of a real-valued function $f$, $\deg_\epsilon(f)$, is the smallest
$d$ for which there exists a multivariate real-valued polynomial $p$ of degree $d$ such that
$\|f - p\|_\infty = \max_x |f(x) - p(x)| \le \epsilon$.

Proposition 3.1 ([Mill]). For every Boolean function $f$, $\deg_{5/6}(f) \le D(f) \le (4 \deg_{5/6}(f))^6$.

Finally we state the communication lower bounds for $g = f \circ \psi_k$. The following input partition
is always assumed when the communication complexity of $g$ is discussed: there are $k + 1$ players
and for each input $(x, y_1, \ldots, y_k)$ to each $\psi_k$, player 0 is assigned $x$, and each player $i$, for $1 \le i \le k$,
is assigned $y_i$. Intuitively, the inputs $y_1, \ldots, y_k$ given to players 1 through $k$ determine which bits
of $x$ (player 0's input) are given to $f$. The next two results say that, when $\psi_k$ is either $\psi^T_{k,\ell}$ or
$\psi^\oplus_{k,a}$ and the encoding $\psi_k$ is over a large enough number of new variables, then the communication
complexity of $g$ is polynomially related to $D(f)$ (up to a factor depending only on $k$).

Theorem 3.2 ([11]). Let $f : \{0,1\}^m \to \{0,1\}$ with 5/6-degree $d \ge 2$. If $\ell$ > ^''^^^'^"^ then
any $(k + 1)$-party communication protocol $\mathcal{P}$ computing $g = f \circ \psi^T_{k,\ell}$ with error 1/3 must have
$|\mathcal{P}|$ that is $\Omega(d/2^k)$.

Theorem 3.3 ([3]). Let $f : \{0,1\}^m \to \{0,1\}$ with 5/6-degree $d \ge 2$. If $2^a$ > ^ ^— ^ then
any $(k + 1)$-party communication protocol $\mathcal{P}$ computing $g = f \circ \psi^\oplus_{k,a}$ with error 1/3 must have
$|\mathcal{P}|$ that is $\Omega(d/2^k)$.

The first theorem uses the tensor selector while the second uses the parity selector. We will
use both of them to prove lower bounds for $T^{cc}(k)$ and $R^{cc}(k)$ proof systems. The parity selector
has an advantage that it needs fewer bits to encode each variable. Thus, it will give stronger proof
complexity lower bounds as a function of the number of variables of the formula (though it is no
more efficient with respect to formula size). In contrast, the advantage of the tensor selector is
that a CNF formula that is lifted based on the tensor selector is easier to refute by small degree
threshold proof systems so we will be able to use this selector to prove rank separations for the
hierarchies of $T^{cc}(k)$ and $R^{cc}(k)$ proof systems.

Overview of the Hardness Escalation Argument

Before giving the formal construction and proofs for the two selectors, we present a brief overview
of our argument. Let $F$ be any $t$-CNF over the variables $e_1, \ldots, e_m$. We want to describe how to
lift $F$ to obtain another unsatisfiable formula $G$, where now $G$ is harder than $F$. Every variable
$e_i$ of $F$ will be replaced by a set of variables $V_i$. The $V_i$ variables will be comprised of $k + 1$ sets of
variables: $x$, and $y_1, \ldots, y_k$. As in the previous section, there will be a selector function $\psi_k$ which
will use the $y$ variables to select one $x$ variable to represent $e_i$. The clauses in $G$ will state that the
$V_i$ variables represent a valid $\psi$-encoding, and that with respect to this encoding, $F$ is true. We
want to show that $G$ is even harder than $F$, i.e., that $G$ requires large $T^{cc}(k)$ rank. By Lemma
2.8, we know that if $G$ has low $T^{cc}(k)$ rank, then there is an efficient $k$-party protocol for solving
the search problem associated with $G$, $G_{search}$. Thus to prove a $T^{cc}(k)$ rank lower bound for $G$, it
suffices to prove that $G_{search}$ is hard in the $k$-party NOF model.

Now any function associated with $G$ is also a lifting of the corresponding function associated
with $F$. In particular, $G_{search} = F_{search} \circ \psi_k$. The intuition for why it should be hard is similar
to that of the lifting of Boolean functions: here $G_{search}$ is a lifting of $F_{search}$, and the decision tree
complexity of $F_{search}$ is large. To prove this, assume for sake of contradiction that $G_{search}$ is easy
for $k$-party communication. Then the $k$ players can efficiently compute $G_{search}$ over the variables
$V_i$. This in turn means that given the variables $V_i$, they can efficiently compute $F_{search}(e_1, \ldots, e_m)$,
where each $e_i = \psi_k(V_i)$. It follows that there exists a consistent system $Z$ of functions for $F$ such
that for any function $f_S \in Z$, the players can easily compute $f_S \circ \psi_k$. In other words, the lifting of
any $f_S \in Z$ is easy for $k$-party communication. It then follows that for appropriate choices of $\psi_k$,
any function in $Z$ has low decision tree complexity. Then by Proposition 2.2 we can conclude that
the decision tree complexity of $F_{search}$ is small, contradicting our assumption. We now proceed to
the formal arguments for each of the selector functions.

3.2 Hardness Escalation Based on the Tensor Selector 

Let $F$ be any $t$-CNF over the variables $e_1, \ldots, e_m$. Parametrized by $k, \ell \ge 2$, $G = Lift^T_{k,\ell}(F)$ is a
CNF formula defined over $m$ sets of variables $V_1, \ldots, V_m$, where each $V_i$ is further partitioned into
two sets $X_i$ of size $\ell^k$ and $Y_i$ of size $k\ell$. Intuitively, every $V_i$ is an encoding of $e_i$ based on $\psi^T_{k,\ell}$. Each
$X_i$ represents a $k$-dimensional tensor of size $\ell^k$ each of whose cells $c$ is associated with a variable
$x_{i,c} \in X_i$. $Y_i$, which is indexed as $\{y_{i,p,a} : 1 \le p \le k, 1 \le a \le \ell\}$, selects a unique cell $c$ in this
tensor as follows: For each $p \in [k]$, exactly one of the variables $y_{i,p,a}$ for $a \in [\ell]$ is true, and the
value $a_p$ such that $y_{i,p,a_p}$ is true is the $p$-th coordinate of $c$. Every clause in $F$ is then transformed
into a set of clauses over these $V_i$. Formally, the clauses in $G$ consist of:

• For $1 \le i \le m$, $1 \le p \le k$, exactly one of $y_{i,p,1}, \ldots, y_{i,p,\ell}$ is 1:

(I) $y_{i,p,1} \lor \cdots \lor y_{i,p,\ell}$

(II) ($1 \le a < a' \le \ell$): $\neg y_{i,p,a} \lor \neg y_{i,p,a'}$

• For every clause, say $\neg e_{i_1} \lor e_{i_2} \lor \cdots \lor e_{i_t}$, in $F$ and for every $t$-tuple of cells $(c_1, \ldots, c_t)$,
if $Y_{i_1}$ selects $c_1$, $Y_{i_2}$ selects $c_2$, etc., then $\neg x_{i_1,c_1} \lor x_{i_2,c_2} \lor \cdots \lor x_{i_t,c_t}$ must be satisfied: this is
translated into one clause of $tk + t$ literals. For example, if the coordinates of $c_1, \ldots, c_t$ are
$(a^1_1, \ldots, a^1_k), \ldots, (a^t_1, \ldots, a^t_k)$, respectively, then the clause would be:

(III) $\neg y_{i_1,1,a^1_1} \lor \cdots \lor \neg y_{i_1,k,a^1_k} \lor \cdots \lor \neg y_{i_t,1,a^t_1} \lor \cdots \lor \neg y_{i_t,k,a^t_k} \lor \neg x_{i_1,c_1} \lor x_{i_2,c_2} \lor \cdots \lor x_{i_t,c_t}$

The next proposition shows that as long as the clauses of F are not too large, then G is also 
not too large, and that G is unsatisfiable as long as F is. 

Proposition 3.4. If $F$ is a $t$-CNF over $m$ variables, then $G = Lift^T_{k,\ell}(F)$ is a CNF formula of
$|F|\ell^{tk} + O(mk\ell^2)$ clauses of size at most $\max\{tk + t, \ell\}$ over $n = m(\ell^k + k\ell)$ variables. Furthermore,
if $F$ is unsatisfiable, then so is $G$.
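The clause and variable counts in this proposition can be checked on a small instance by generating the lifted formula explicitly. The sketch below is an illustration only: the clause representation (lists of `(variable, polarity)` pairs), the tuple-based variable names, and the names `lift_tensor` and `F_CLAUSES` are assumptions made for the example, and coordinates are 0-based.

```python
from itertools import combinations, product

def lift_tensor(clauses, m, k, ell):
    """Build the tensor-lifted CNF from a CNF over variables 0..m-1.

    Input clauses are lists of (variable index, polarity) pairs.
    Output literals are (name, polarity) with names ("y", i, p, a) or ("x", i, cell).
    """
    lifted = []
    for i in range(m):
        for p in range(k):
            # (I): at least one y_{i,p,a} is true.
            lifted.append([(("y", i, p, a), True) for a in range(ell)])
            # (II): no two y_{i,p,a}, y_{i,p,a'} are both true.
            for a, b in combinations(range(ell), 2):
                lifted.append([(("y", i, p, a), False), (("y", i, p, b), False)])
    cells = list(product(range(ell), repeat=k))
    for clause in clauses:
        # (III): one clause per choice of a cell for each literal of the clause.
        for choice in product(cells, repeat=len(clause)):
            new = []
            for (i, _), c in zip(clause, choice):
                new.extend((("y", i, p, c[p]), False) for p in range(k))
            new.extend((("x", i, c), pos) for (i, pos), c in zip(clause, choice))
            lifted.append(new)
    return lifted

# Unsatisfiable 2-CNF on variables e0, e1: (e0), (not e0 or e1), (not e1).
F_CLAUSES = [[(0, True)], [(0, False), (1, True)], [(1, False)]]
```

With `m = 2`, `k = 2`, `ell = 2` the generator produces `m * k * (1 + 1) = 8` selector clauses and `4 + 16 + 4 = 24` clauses of type (III), over `m * (ell**k + k * ell) = 16` variables, in line with the stated bounds.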

We say that an assignment to $X_1, Y_1, \ldots, X_m, Y_m$ of $G$ is a valid encoding of an assignment to
variables $e_1, \ldots, e_m$ of $F$ if all clauses (I) and (II) are satisfied and for every $i$, $x_{i,c} = e_i$ where $c$ is
selected by $Y_i$.

We fix the following input partition to $k + 1$ players when discussing the communication
complexity of $G_{search}$: player 0 is assigned all of the $X_i$'s, and each player $p$, for $1 \le p \le k$, is assigned
$\{y_{i,p,a} : 1 \le i \le m, 1 \le a \le \ell\}$.

The following lemma says that if $G_{search}$ is easy in communication complexity, then there exists
a consistent system $Z = \{f_S : S \subseteq clauses(F)\}$ for $F$ such that for every $f_S \in Z$, computing
$f_S \circ \psi^T_{k,\ell}$ is also easy in communication complexity.

Lemma 3.5. Let $F$ be any unsatisfiable $t$-CNF formula and $G = Lift^T_{k,\ell}(F)$. Suppose that there
is a $(k + 1)$-party communication protocol $\mathcal{P}$ consistently computing $G_{search}$ with error $\epsilon$ such that
$|\mathcal{P}| \le C$. Then there exists a consistent system $Z = \{f_S : S \subseteq clauses(F)\}$ of functions for $F$
such that for every $S$, there is a $(k + 1)$-party communication protocol $\mathcal{P}_S$ consistently computing
$f_S \circ \psi^T_{k,\ell}$ with error $\epsilon$ such that $|\mathcal{P}_S| \le C$.

Proof. For every input assignment $\alpha$ to $F$, we fix any input assignment $\alpha^G$ to $G$ that is a valid
encoding of $\alpha$. Let $g^*$ be the subfunction of $G_{search}$ that is computed by $\mathcal{P}$.

We first observe that on any input assignment $\alpha$ and $\alpha^G$, $g^*(\alpha^G)$ always outputs a type (III)
clause. This is because $\alpha^G$ is a valid encoding. This clause corresponds to a unique clause in $F$
that is falsified by $\alpha$. Thus $g^*$ uniquely determines a subfunction $f^*$ of $F_{search}$.

Given $f^*$, we define the consistent system $Z = Z_{f^*}$ for $F$ using the construction in Proposition [2Tl.
For every $f_S \in Z$, the protocol $\mathcal{P}_S$ for $f_S \circ \psi^T_{k,\ell}$ is adapted from $\mathcal{P}$ in the straightforward
way. □
The next theorem is our main result on the proof complexity of $G$, which glues all the parts
together.

Theorem 3.6. There are absolute constants $c, c' > 0$ such that the following holds. Let $F$ be
any $t$-CNF formula on $m$ variables having resolution rank at least $r$ and let $G = Lift^T_{k,\ell}(F)$ for

^ ^ (r%s\F\?/^. Then for any $C$ and $M = c'(r/\log_2 |F|)^{1/6}/(C 2^k)$,

• any $R^{cc}(k + 1, C)$ refutation of $G$ of rank $R$ must have $R \log_2 R \ge M$, and

• any tree-like $T^{cc}(k + 1, C)$ refutation of $G$ of size $S$ must have $\log S \log\log S \ge M$.

Proof. We will prove only the first part; the second part follows similarly.

Let $P$ be an $R^{cc}(k + 1, C)$ refutation of $G$ of rank $R$. By Lemma 2.8, there exists a $(k + 1)$-party
protocol $\mathcal{P}$ consistently computing $G_{search}$ of error 1/3 such that $|\mathcal{P}|$ is $O(CR \log R)$.

Now on the one hand, by Lemma 3.5 there exists a consistent system $Z = \{f_S : S \subseteq clauses(F)\}$
of functions for $F$ such that for every $S$, there exists a $(k + 1)$-party protocol $\mathcal{P}_S$ computing $f_S \circ \psi^T_{k,\ell}$
of error 1/3 such that $|\mathcal{P}_S|$ is $O(CR \log R)$.

On the other hand, by Proposition 2.3 the assumption on the resolution rank of $F$ implies that
$D(F_{search}) \ge r$. By Proposition 2.2 there exists a function $f_S \in Z$ such that

$$D(f_S) \ge \frac{D(F_{search})}{\lceil \log_2 |F| \rceil} \ge \frac{r}{\lceil \log_2 |F| \rceil}.$$

By Proposition 3.1 we have $d = \deg_{5/6}(f_S) \ge (D(f_S))^{1/6}/4 \ge (r/\lceil \log_2 |F| \rceil)^{1/6}/4$.

Finally, by Theorem 3.2 we must have $CR \log R$ that is $\Omega(d/2^k)$, which is $\Omega((r/\log_2 |F|)^{1/6}/2^k)$.
□

We note that we have a somewhat matching upper bound on the rank complexity of G. 

Lemma 3.7. Let $F$ be a $t$-CNF formula on $m$ variables having resolution rank $r$. There is some
absolute constant $c > 0$ such that for any $\ell > 1$, there is a $T^{cc}(2, \log_2(rk\ell))$ proof of $G = Lift^T_{k,\ell}(F)$
of rank at most $crk \log_2 \ell$.

Proof. The main idea is to first build a decision tree for $G_{search}$ using the decision tree for $F_{search}$.
The key idea in the search is that for every $e_i$ there is precisely one variable $x_{i,(a_1,\ldots,a_k)}$ for
$a_1, \ldots, a_k \in [\ell]$ whose value will replace that of $e_i$ in evaluating $G$. This selection is determined by
the one tuple for which all of $y_{i,1,a_1}, \ldots, y_{i,k,a_k}$ evaluate to 1.

Whenever the decision tree for $F_{search}$ queries a variable $e_i$, the decision tree over linear
inequalities for $G_{search}$ does $k$ binary searches where the $p$-th one queries inequalities of the form
$\sum_{a \in [j,j']} y_{i,p,a} \ge 1$ to find the unique $a_p$ such that $y_{i,p,a_p} = 1$. At the leaf of this search
corresponding to the tuple $(a_1, \ldots, a_k)$, the query of $e_i$ is replaced by a query to $x_{i,(a_1,\ldots,a_k)}$.
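One of the $k$ binary searches can be sketched as follows. This is an illustration under the assumption that the row $y_{i,p,1}, \ldots, y_{i,p,\ell}$ is given as a 0/1 list with exactly one 1 (as the selector axioms force); the name `find_selected` is hypothetical and indices are 0-based. Each loop iteration corresponds to one query of a range-sum inequality.

```python
def find_selected(y):
    """Locate the unique 1 in y using range-sum queries.

    y : list of 0/1 values containing exactly one 1
    Returns (index of the 1, number of range-sum inequalities queried).
    """
    lo, hi = 0, len(y) - 1
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        # Query the inequality: sum of y[a] for a in [lo, mid] is at least 1.
        if sum(y[lo:mid + 1]) >= 1:
            hi = mid
        else:
            lo = mid + 1
    return lo, queries
```

The number of queries is at most $\lceil \log_2 \ell \rceil$ per search, so replacing one query to $e_i$ costs about $k \log_2 \ell$ inequality queries, which matches the $crk \log_2 \ell$ rank bound of the lemma.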

It remains to convert this decision tree to a $T^{cc}(2)$ refutation. We follow the standard
conversion of decision trees to proofs implicit in the equivalence in Proposition 2.3. Each node of the
decision tree for $G_{search}$ will be a line in the new proof. Each such node $v'$ is associated with a node
$v$ of the derivation for $F_{search}$ which also corresponds to a line in the resolution refutation of $F$ that
is some clause $C_v$ on the variables of $F$. The line labelling $v'$ in the proof of $G$ will consist of a
disjunction of several literals and one polynomial inequality associated with the current binary search. In
particular, for each literal in $C_v$, if the branch on which $v'$ lies has determined that $x_{i,(a_1,\ldots,a_k)}$ is
the selected literal to replace $e_i$ then the disjunction will include $\neg y_{i,1,a_1} \lor \ldots \lor \neg y_{i,k,a_k}$ together
with the corresponding literal on $x_{i,(a_1,\ldots,a_k)}$, whose negation indicates the selection and the value
substituted for $e_i$. However, at node $v'$, only
part of the next level of search may be completed. Suppose that $p - 1 < k$ binary searches are
completed at $v'$ for the current branch variable $e_{i'}$. In this case we add $\neg y_{i',1,a_1} \lor \ldots \lor \neg y_{i',p-1,a_{p-1}}$
to the disjunction at $v'$. Finally, if the current binary search has been restricted to a range $[j, j']$
then we add one more disjunct: the linear inequality $\sum_{a \in [j,j']} y_{i',p,a} \le 0$. (If $j = j'$ this is equivalent
to $\neg y_{i',p,j}$.) (Note that in moving down the proof tree, as we start a binary search we have no such
linear inequality disjunct but we can add $\sum_{a \in [\ell]} y_{i',p,a} \le 0$ via the axiom on the selector.)

It is clear that the proof tree is binary and each line can be viewed as a disjunction of two
linear inequalities on at most $rk\ell$ binary values which can be evaluated efficiently by a 2-party
randomized protocol. □

Corollary 3.8. Let $t$ be some constant. Suppose that a family of polynomial-size $t$-CNF formulas
$F$ on $m$ variables has resolution rank complexity $r = r(m)$ that is $m^{\Omega(1)}$. Then, for every constant
$\epsilon > 0$ and $k \ge 1$, there is a family of CNF formulas $G = Lift^T_k(F)$ on $n$ variables of size $n^{O(t)}$ such
that if $k \le (1 - \epsilon) \log\log n$ then

• $G$ requires $R^{cc}(k + 1)$ refutation rank complexity $\Omega(r^{1/7}) = n^{\Omega(1/k)}$;

• there is a $T^{cc}(2)$ refutation of $G$ of rank $O(r \log n)$;

• $G$ requires $T^{cc}(k + 1)$ tree-size $\exp(n^{\Omega(1/k)})$.

Proof. Apply Theorem 3.6 with $\ell$ being the least integer satisfying the constraint; that is, $\ell$ =
-^-^ for some constant $c'' > 0$ since $|F|$ is polynomial in $m$.

(rf logm.)'-'° ' ' 1^ J 

By Proposition 3.4 the resulting formula $G = Lift^T_k(F)$ has $n = m(\ell^k + k\ell)$ variables. Now for
$k \le (1 - \epsilon) \log\log n$ and since $r$ is $m^{\Omega(1)}$, for sufficiently large $n$ we have $\ell^k + k\ell \le$ 2'^ ^ $(c''m)^k \le$

Also by Proposition 3.4, $|G|$ is $O(|F|\ell^{tk} + mk\ell^2)$, which is $n^{O(t)}$.

Suppose that there is a $T^{cc}(k + 1)$ refutation $P$ of $G$ of rank $R$. Hence by definition, there is
some constant $\beta > 0$ such that $P$ is a $T^{cc}(k + 1, \log^\beta n, 1/3)$ refutation. By Theorem 3.6 we have
that $R \log R$ is $\Omega((r/\log m)^{1/6}/(2^k \log^\beta n))$. Thus for sufficiently large $n$, $R$ is $\Omega(r^{1/7}) = n^{\Omega(1/k)}$
since $r$ is $m^{\Omega(1)}$.

The rank upper bound follows easily from Lemma 3.7 and the proof for the tree-like size lower
bound is similar. □

In particular, by Lemma 2.5, Corollary 3.8 applies to all Th($k$) proof systems.



1 Sf„i/^\k £qj, gome constant 6 > 0. It follows that n < ^m{c"m)^ and hence m is TnP'^^l^\ 



3.3 Hardness Escalation Based on the Parity Selector 

Let $F$ be any $t$-CNF over the variables $e_1, \ldots, e_m$. Parametrized by $k, a \ge 2$, $Lift^\oplus_{k,a}(F)$ is a CNF
defined over $m$ sets of variables $V_1, \ldots, V_m$, where each $V_i$ is further partitioned into two sets $X_i$ and
$Y_i$. The difference here with $Lift^T_{k,\ell}(F)$ is that every $V_i$ is an encoding of $e_i$ based on $\psi^\oplus_{k,a}$. That is,
each $X_i$ has $2^a$ variables that are indexed by $a$-bit vectors, each $Y_i = \{y_{i,p,b} : 1 \le p \le k, 1 \le b \le a\}$
has $ka$ variables, and each $Y_i$ selects a unique $a$-bit vector $c$ with $c_b = \oplus_{p=1}^{k} y_{i,p,b}$, for $1 \le b \le a$.
The clauses in $Lift^\oplus_{k,a}(F)$ consist of:
(*) For every clause, say $e_{i_1} \lor \cdots \lor e_{i_t}$, in $F$ and for every $t$-tuple of $a$-bit vectors $(c_1, \ldots, c_t)$, if
$Y_{i_1}$ selects $c_1$, $Y_{i_2}$ selects $c_2$, etc., then $x_{i_1,c_1} \lor \cdots \lor x_{i_t,c_t}$ must be satisfied. For every clause and
$t$-tuple, this is translated into $\le 2^{tka}$ clauses of size $tka + t$ in the straightforward way. That
is, there are $\le 2^{ka}$ assignments to the bits in $Y_{i_1}$ that make them select $c_1$, and similarly for
$Y_{i_2}$, etc. There is one clause, similar to the clauses of type (III) in the tensor selector case,
corresponding to each such assignment.
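The count of selecting assignments can be verified by brute force for small parameters. The sketch below is illustrative: it assumes $a$-bit vectors are encoded as integers in `range(2**a)`, and `count_selecting` is a hypothetical name. Since the first $k - 1$ vectors can be chosen freely and the last is then determined, the exact count is $2^{(k-1)a}$, which is within the $\le 2^{ka}$ bound used above.

```python
from functools import reduce
from itertools import product
from operator import xor

def count_selecting(k, a, c):
    """Count k-tuples of a-bit vectors (as integers) whose bitwise XOR equals c."""
    return sum(
        1
        for ys in product(range(2 ** a), repeat=k)
        if reduce(xor, ys, 0) == c
    )
```

For instance, with `k = 3` and `a = 2` every target `c` is selected by exactly `2 ** 4 = 16` of the `2 ** 6 = 64` assignments to the bits of a single $Y_i$.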

Proposition 3.9. If $F$ is a $t$-CNF over $m$ variables, then $G = Lift^\oplus_{k,a}(F)$ is a CNF formula of at
most $|F| 2^{tka+ta}$ clauses of size at most $ka + t$ over $n = m(2^a + ka)$ variables. Furthermore, if $F$ is
unsatisfiable, then so is $G$.

The rest of the proofs for this section are very similar to those in the last section. The first
difference is that since $\psi^\oplus_{k,a}$ gives a more efficient encoding than $\psi^T_{k,\ell}$, the blow-up in the number
of variables of $G$ is significantly reduced. The second difference is that, here, $G$ has a small rank
resolution refutation, as opposed to a GC(2) refutation in the last section when the lifting was
done using tensor-encoding. There, small rank resolution refutation was impossible because the
final clauses were too large.

The proof of the following theorem, which lower bounds the proof complexity of $Lift^\oplus_{k,a}(F)$ in
terms of that of $F$, is identical to that of Theorem 3.6 except that Theorem 3.3 for $\psi^\oplus_{k,a}$ is used in
place of Theorem 3.2.

Theorem 3.10. There are absolute constants $c, c' > 0$ such that the following holds. Let $F$ be
any $t$-CNF formula on $m$ variables having resolution rank at least $r$ and let $G = Lift^\oplus_{k,a}(F)$ for

$2^a$ > (f/iog|FDV6. Then for any $C$ and $M = c'(r/\log_2 |F|)^{1/6}/(C 2^k)$,

• any $T^{cc}(k + 1, C)$ refutation of $G$ of rank $R$ must have $R \log_2 R \ge M$, and

• any tree-like $T^{cc}(k + 1, C)$ refutation of $G$ of size $S$ must have $\log S \log\log S \ge M$.

On the other hand, one can upper bound the rank complexity of $Lift^\oplus_{k,a}(F)$ in terms of that of
$F$, even in resolution.

Lemma 3.11. Let $F$ be a $t$-CNF formula on $m$ variables having resolution rank $r$. There is some
absolute constant $c > 0$ such that for any $a \ge 1$, there is a resolution refutation of $G = Lift^\oplus_{k,a}(F)$
of rank at most $crka$.

Proof. It is straightforward to construct a decision tree for $G_{search}$ given one for $F_{search}$. Whenever
a variable $e_i$ in $F$ is queried, the decision tree for $G_{search}$ makes $ka$ queries to the $ka$ variables in $Y_i$
to find the selected $x_{i,c}$ whose value replaces $e_i$ in evaluating $G$. Thus the depth is multiplied by
$O(ka)$. □

Corollary 3.12. Let $t$ be some constant. Suppose that a family of polynomial size $t$-CNF formulas
$F$ on $m$ variables has resolution rank complexity $r = r(m)$ that is $m^{\Omega(1)}$. Then, for every $\epsilon > 0$
and $k \ge 1$, there is a family of CNF formulas $G = Lift^\oplus_k(F)$ on $n$ variables of size $n^{O(tk)}$ such that
if $k \le (1 - \epsilon) \log\log n$ then

• $G$ has $R^{cc}(k + 1)$ refutation rank complexity $\Omega(r^{1/7}) = n^{\Omega(1)}$;

• there is a resolution refutation of $G$ of rank $O(rk \log n)$;

• $G$ requires $T^{cc}(k + 1)$ tree-size $\exp(n^{\Omega(1)})$.

Proof. We apply Theorem 3.10 with $a$ being the least integer satisfying the constraint; that is, $2^a$
is '^(y. I logm)^!^ for some constant $c'' > 0$ since $|F|$ is polynomial in $m$.

By Proposition 13.9^ n = m(2" + ka). Now for k < {1 — e) log log n and since r is m^^^\ for 
sufficiently large n we have 2'' + ka< c"2^ m< c"n} for some constant 6 > 0. It follows that 
n = m{2°' + ka) < c"v}~^m? and hence m is Also by Proposition EJl |G| is |F|2*^"+*"(A;a + t), 

which is rP^^^\ 

Suppose that there is a $T^{cc}(k + 1)$ refutation $P$ of $G$ of rank $R$. Hence by definition, there is
some constant $\beta > 0$ such that $P$ is a $T^{cc}(k + 1, \log^\beta n, 1/3)$ refutation. By Theorem 3.10 we have
$R \log R = \Omega((r/\log m)^{1/6}/(2^k \log^\beta n))$. Thus for sufficiently large $n$, $R = \Omega(r^{1/7}) = n^{\Omega(1)}$ since $r$ is
$m^{\Omega(1)}$.

The rank upper bound follows easily from Lemma 3.11 and the proof for the tree-like size lower
bound is similar. □

In particular, by Lemma 2.5, Corollary 3.12 applies to all Th($k$) proof systems.

4 Rank and Tree-like Size Separations of the Proof System Hierarchy

In this section we separate $R^{cc}(k)$ and CP($k$) in terms of rank and tree-like size, thereby separating
$R^{cc}(k + 1)$ from $R^{cc}(k)$ and $T^{cc}(k + 1)$ from $T^{cc}(k)$. The main idea is that if an unsatisfiable $t$-CNF
formula $F$ has a small rank CP proof, then we will show that $G = Lift^T_{k-1,\ell}(F)$ has a small rank
CP($k$) proof (that can be made small and tree-like). Moreover, if $F$ requires large resolution rank,
then with the right parameters, $G$ has no small rank (or small tree-like) $T^{cc}(k)$ proof. Thus $G$ is a
separating instance.

The pigeonhole principle is known to be hard for resolution but admits a small rank CP proof.
Since we need the clauses of the input formula to be of constant size for the size of the formula
$Lift^T_{k,\ell}(F)$ to be polynomial, we use the following generalization of the pigeonhole principle.

Let $\mathcal{G} = (U \cup V, E)$ be any bipartite graph, where $U$ represents the pigeons and $V$ the holes
and associate a variable $0 \le e_{(u,v)} \le 1$ with each edge $(u, v) \in E$. $\mathcal{G}$-PHP consists of the following
clauses, which have been translated to inequalities:

(P) for all $u \in U$: $\sum_{(u,v) \in E} e_{(u,v)} \ge 1$

(H) for all $u \ne u' \in U$, $v \in V$ s.t. $(u, v), (u', v) \in E$: $e_{(u,v)} + e_{(u',v)} \le 1$
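The graph pigeonhole clauses can be generated and checked by brute force on tiny graphs. The sketch below is illustrative only: edges are assumed to be given as a duplicate-free list of `(pigeon, hole)` pairs, clauses are lists of `(edge, polarity)` literals (the CNF form of the inequalities above), and the names `graph_php` and `satisfiable` are hypothetical.

```python
from itertools import combinations, product

def graph_php(edges):
    """Return the clauses of the graph pigeonhole principle for a bipartite edge list."""
    clauses = []
    # (P): every pigeon u uses at least one of its edges.
    for u in sorted({u for u, _ in edges}):
        clauses.append([(e, True) for e in edges if e[0] == u])
    # (H): no hole v receives two different pigeons.
    for v in sorted({v for _, v in edges}):
        into_v = [e for e in edges if e[1] == v]
        for e1, e2 in combinations(into_v, 2):
            clauses.append([(e1, False), (e2, False)])
    return clauses

def satisfiable(clauses, variables):
    """Brute-force satisfiability test; only suitable for very small formulas."""
    for bits in product([False, True], repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[v] == pos for v, pos in clause) for clause in clauses):
            return True
    return False
```

On the complete bipartite graph with 3 pigeons and 2 holes this yields 3 clauses of type (P) and 6 of type (H), and the formula is unsatisfiable; with 2 pigeons and 2 holes it is satisfiable.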

Proposition 4.1 (^). For every $n$, there is a bipartite graph $\mathcal{G} = (U \cup V, E)$, where $|U| = |V| + 1 = n$
and the degree of every vertex in $U$ is $\le 5$, such that $\mathcal{G}$-PHP is a polynomial size 5-CNF on
$m = 5n$ variables and requires resolution rank $\Omega(m)$.

From this we immediately obtain a rank lower bound for a lifting of $\mathcal{G}$-PHP.

Lemma 4.2. There is a family of bipartite graphs $\mathcal{G}$ and a family of polynomial-size CNF formulas
$Lift^T_{k-1}(\mathcal{G}$-PHP$)$ on $n$ variables that requires refutation rank $n^{\Omega(1/k)}$ and tree-like refutation size
$\exp(n^{\Omega(1/k)})$ in any $R^{cc}(k)$ systems for any $k \le (1 - \epsilon) \log\log n$ where $\epsilon > 0$ is some absolute
constant.






Proof. Let $\mathcal{G}$ be as given by Proposition 4.1. Then $\mathcal{G}$-PHP has linear resolution rank. The lemma
follows from Corollary 3.8. □



Our upper bound for the lifted versions of $\mathcal{G}$-PHP will be derived from the following CP rank
upper bound for $\mathcal{G}$-PHP itself.

Proposition 4.3 ([7]). For any $\mathcal{G} = (U \cup V, E)$ with $|U| = |V| + 1$, $\mathcal{G}$-PHP has a CP refutation
of rank $O(\log |U|)$.

Before considering the lifted versions of $\mathcal{G}$-PHP directly, we first give a generic method for easily
deriving some convenient CP($k$) consequences for lifted formulas. Suppose that $F$ has variables
$e_1, \ldots, e_m$ and let $G = Lift^T_{k-1,\ell}(F)$. The variables in $G$ are $x_{i,c}$ (recall that each cell $c$ is indexed by
a tuple in $[\ell]^{k-1}$) and $y_{i,p,a}$, where $1 \le i \le m$, $1 \le p \le k - 1$, and $1 \le a \le \ell$.

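For each variable $e_i$ of $F$ define a degree $k$ polynomial $\hat{e}_i$ as

$$\hat{e}_i = \sum_{c = (a_1, \ldots, a_{k-1}) \in [\ell]^{k-1}} y_{i,c}\, x_{i,c}$$

where

$$y_{i,c} := y_{i,1,a_1} \cdot y_{i,2,a_2} \cdots y_{i,k-1,a_{k-1}}.$$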

We show how to convert the original axiom clauses (I), (II), and (III) in $G$ into the following
forms that are easier to manipulate in CP($k$) systems:

(I') for all $1 \le i \le m$: $\sum_{c \in [\ell]^{k-1}} y_{i,c} \ge 1$

(II') for all $1 \le i \le m$ and $c \ne c' \in [\ell]^{k-1}$: $y_{i,c} + y_{i,c'} \le 1$

(III') for all clauses in $F$, say $\neg e_{i_1} \lor e_{i_2} \lor \cdots \lor e_{i_t}$, and for every $t$-tuple of cells $(c^1, \ldots, c^t)$,

$y_{i_1,c^1} x_{i_1,c^1} + y_{i_2,c^2}(1 - x_{i_2,c^2}) + \cdots + y_{i_t,c^t}(1 - x_{i_t,c^t}) \le t - 1$

Lemma 4.4. For any CNF formulas $F$ and $G = Lift^T_{k-1,\ell}(F)$ for any $k, \ell \ge 2$, there are CP($k$)
derivations of rank $k$ of all (I'), (II'), and (III') inequalities as well as $0 \le y_{i,c} \le 1$ and $0 \le \hat{e}_i \le 1$
given the families of clauses (I), (II), and (III) in $G$.

Proof. Note that the CP(A;) rule that (g > 0) h {xiq > 0) for all polynomials q of degree at most 
k — 1 and variables Xi implies that if we have inequalities qi > bi and q2 > 62 such that the sum 
of the degrees of qi and q2 is at most k then qiq2 > 6162 is derivable in CP(A;) in rank at most the 
minimum of the degrees of qi and q2- 

The facts that < ei and < yj^c then follow immediately in rank k — 1 from < yi^p^a and 
< Ci. 

The (I) axioms in $G$ of the form $\bigvee_{a \in [\ell]} y_{i,p,a}$ translate to $\sum_{a \in [\ell]} y_{i,p,a} \ge 1$. Applying this product
rule of CP(k) for $p \in [k-1]$ we multiply all of the inequalities together in total rank $k-2$ to obtain
inequality (I') above. By the product rule and $x_{i,c} \le 1$, in rank $k-1$ we obtain $x_{i,c}\, y_{i,c} \le y_{i,c}$ for
all $i, c$ and combining with (I') we obtain that $\hat{e}_i \le 1$ for all $i$.

To obtain an inequality of type (II'), consider some index $p$ such that $c_p \ne c'_p$. We have the
translation of the (II) axiom of $G$, $(\neg y_{i,p,c_p} \vee \neg y_{i,p,c'_p})$, which yields $y_{i,p,c_p} + y_{i,p,c'_p} \le 1$. Since we also
have $y \le 1$ for every variable $y$, by applying the product rule $k-2$ times we have $y_{i,c} \le y_{i,p,c_p}$ and
$y_{i,c'} \le y_{i,p,c'_p}$ and thus (II') follows immediately. (The weaker constraint that $y_{i,c} \le 1$ is also an
immediate implication.)

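To obtain an inequality of type (III'), we use the translation of the (III) clauses of $G$ of the
form $\bigvee_{j=1}^{t} \big( \bigvee_{p=1}^{k-1} \neg y_{i_j,p,c^j_p} \vee x_{i_j,c^j} \big)$, which is $\sum_{j=1}^{t} \big( \sum_{p=1}^{k-1} (1 - y_{i_j,p,c^j_p}) + x_{i_j,c^j} \big) \ge 1$. Observe that in
rank $k$ by the product rule and $0 \le x_{i_j,c^j} \le 1$ we can derive

$$\sum_{p=1}^{k-1} y_{i_j,p,c^j_p} + (1 - x_{i_j,c^j}) \ge k\,(1 - x_{i_j,c^j})\, y_{i_j,c^j}.$$

Therefore we have

$$\sum_{p=1}^{k-1} (1 - y_{i_j,p,c^j_p}) + x_{i_j,c^j} \le k - k\,(1 - x_{i_j,c^j})\, y_{i_j,c^j}.$$

Plugging this into the original inequality we obtain that $\sum_{j=1}^{t} \big( k - k\,(1 - x_{i_j,c^j})\, y_{i_j,c^j} \big) \ge 1$. Dividing
everything by $k$ and rounding up yields $\sum_{j=1}^{t} \big( 1 - (1 - x_{i_j,c^j})\, y_{i_j,c^j} \big) \ge 1$. Rewriting, we derive
$\sum_{j=1}^{t} (1 - x_{i_j,c^j})\, y_{i_j,c^j} \le t - 1$, which is the corresponding inequality (III').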

Thus in rank $k$ we can derive all the inequalities (I'), (II'), and (III'). □
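The key intermediate inequality in the (III') derivation, $\sum_{p=1}^{k-1} y_p + (1 - x) \ge k(1 - x)\prod_{p=1}^{k-1} y_p$, can be sanity-checked on all 0/1 points. The sketch below is only such a semantic check (the function name `key_inequality_holds` is an assumption of this illustration); it says nothing about the rank of the derivation.

```python
from itertools import product

def key_inequality_holds(k):
    """Check sum_p y_p + (1 - x) >= k * (1 - x) * prod_p y_p on every 0/1 point."""
    for x in (0, 1):
        for ys in product((0, 1), repeat=k - 1):
            prod_y = 1
            for y in ys:
                prod_y *= y
            if sum(ys) + (1 - x) < k * (1 - x) * prod_y:
                return False
    return True

# The inequality is tight exactly when x = 0 and every y_p = 1.
checked = all(key_inequality_holds(k) for k in range(2, 8))
```

The right side is nonzero only when $x = 0$ and all selector bits are 1, in which case both sides equal $k$.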

We now have the tools to derive an upper bound on the CP(k) rank of lifted $\mathcal{G}$-PHP formulas
and complete the rank separation.

Lemma 4.5. For any $\mathcal{G} = (U \cup V, E)$ with $|U| = |V| + 1$ in which the degree of every vertex in $U$ is at
most $t$, $G = \mathrm{Lift}_{k-1,\ell}(\mathcal{G}\text{-PHP})$ has a CP(k) refutation of rank $O(\log |U| + tk \log \ell)$, for any $k, \ell \ge 2$.

Proof. For ease of notation, we denote the variables in $F = \mathcal{G}$-PHP as $e_1, \ldots, e_m$, where $m = |E|$.
The idea is that we will simulate the CP-refutation for $F$ in Proposition 4.3 by replacing each
variable in the proof with the degree $k$ polynomial $\hat{e}_i$ (together with the associated degree $k-1$
polynomials $y_{i,c}$) using the inequalities (I'), (II'), (III') and $0 \le \hat{e}_i \le 1$ from Lemma 4.4. The rank
of the new refutation given these degree $k$ inequalities will be the same as that of the CP-refutation
of $F$.

By Lemma 4.4 there are rank-$k$ derivations of all the axiom inequalities in $F$ (consisting of
$0 \le e_i \le 1$ and (P) and (H) inequalities) with $e_i$ replaced with $\hat{e}_i$, given the original axiom clauses
(I), (II), and (III) in $G$ (as defined in Section 3.2), so there will be a CP(k)-refutation of $G$ of rank
$k + r$, where $r$ is the rank of the CP-refutation of $F$.

Claim 4.6. Given inequalities (I'), (II'), and (III'), for all (P)-type axioms $e_{i_1} + \cdots + e_{i_t} \ge 1$, for
some $t > 0$, in $F = \mathcal{G}$-PHP, the inequality $\hat{e}_{i_1} + \cdots + \hat{e}_{i_t} \ge 1$ is rank-$O(tk \log \ell)$ derivable in CP(k).

Proof. Denoting $z_{i,c} := x_{i,c}\, y_{i,c}$, our goal is to derive

$$\sum_{c \in [\ell]^{k-1}} z_{i_1,c} + \cdots + \sum_{c \in [\ell]^{k-1}} z_{i_t,c} \ge 1$$

given that, from type (III') inequalities, for every $t$-tuple of cells $(c^1, \ldots, c^t)$,

$$y_{i_1,c^1} + \cdots + y_{i_t,c^t} \le t - 1 + z_{i_1,c^1} + \cdots + z_{i_t,c^t},$$

and, from type (I') axioms, for each $i \in \{i_1, \ldots, i_t\}$,

$$\sum_{c \in [\ell]^{k-1}} y_{i,c} \ge 1.$$

We will proceed in $t$ steps, where at step $j$, we will derive, for every $(t-j)$-tuple of cells $(c^{j+1}, \ldots, c^t)$,

$$(S_j)\qquad y_{i_{j+1},c^{j+1}} + \cdots + y_{i_t,c^t} \le (t-j-1) + \Big( \sum_{c} z_{i_1,c} + \cdots + \sum_{c} z_{i_j,c} \Big) + \big( z_{i_{j+1},c^{j+1}} + \cdots + z_{i_t,c^t} \big).$$

Thus we will be done at the end of step $t$. Now, to proceed by induction, assume we have finished
step $j$ and are now at step $j+1$. First, for every $(t-j-1)$-tuple of cells $(c^{j+2}, \ldots, c^t)$, we add together
all $S_j$-inequalities for all cells $c^{j+1} \in [\ell]^{k-1}$ and use $\sum_{c \in [\ell]^{k-1}} y_{i_{j+1},c} \ge 1$ to get

$$1 + \ell^{k-1} \big( y_{i_{j+2},c^{j+2}} + \cdots + y_{i_t,c^t} \big) \le \ell^{k-1}(t-j-1) + \sum_{c} z_{i_{j+1},c} + \ell^{k-1} \Big( \sum_{c} z_{i_1,c} + \cdots + \sum_{c} z_{i_j,c} \Big) + \ell^{k-1} \big( z_{i_{j+2},c^{j+2}} + \cdots + z_{i_t,c^t} \big).$$

Next we add $\ell^{k-1} - 1$ copies of $\sum_{c \in [\ell]^{k-1}} z_{i_{j+1},c}$ to the right side and divide by $\ell^{k-1}$ to get an $S_{j+1}$
inequality. Each step requires rank $O(k \log \ell)$ with fan-in 2. The claim follows. □
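The final division step can be spelled out as follows. This is a sketch using shorthand introduced only here: $L = \ell^{k-1}$, $Y$ for the sum of the remaining $y$ terms $y_{i_{j+2},c^{j+2}} + \cdots + y_{i_t,c^t}$, and $Z$ for all $z$ terms on the right side after the extra copies have been added (so every term of $Z$ has coefficient 1 after division).

```latex
\begin{align*}
1 + L\,Y &\le L\,(t-j-1) + L\,Z
  && \text{(after adding the $L-1$ copies)}\\
Y - Z &\le (t-j-1) - \tfrac{1}{L}
  && \text{(dividing by $L$)}\\
Y - Z &\le t-(j+1)-1
  && \text{(rounding down, since $0 < \tfrac{1}{L} < 1$)}
\end{align*}
```

The last line is exactly the shape of $S_{j+1}$, with the constant term decreased by one.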

Claim 4.7. For all (H) axioms $e_{i_1} + e_{i_2} \le 1$ in $F$, $\hat{e}_{i_1} + \hat{e}_{i_2} \le 1$ is rank-$O(k \log \ell)$ derivable in
CP(k).

Proof. For $i$ either $i_1$ or $i_2$ and for any $c \ne c' \in [\ell]^{k-1}$, in one step we can derive $y_{i,c}\, x_{i,c} +
y_{i,c'}\, x_{i,c'} \le 1$ from the (II') inequality $y_{i,c} + y_{i,c'} \le 1$.

For every pair of cells $(c^1, c^2)$, we are also given the type (III') inequality $y_{i_1,c^1}\, x_{i_1,c^1} + y_{i_2,c^2}\, x_{i_2,c^2} \le 1$.

We need to derive $\sum_{c \in [\ell]^{k-1}} x_{i_1,c}\, y_{i_1,c} + \sum_{c \in [\ell]^{k-1}} x_{i_2,c}\, y_{i_2,c} \le 1$. Thus we want the sum of
a set of $O(\ell^{k-1})$ variables to be at most 1, given that the sum of any pair of them is at most 1. By
a result of ([7], Theorem 6.1), this can be done in rank $O(k \log \ell)$. □
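To give some intuition for the logarithmic rank, here is one standard doubling argument for inequalities of this kind; it is offered as an illustration and is not claimed to be the argument of [7]. If the sum over every set of $s$ nonnegative variables is at most 1, then summing over all $s$-subsets of a set of size $2s-1$ and dividing by the common coefficient bounds the larger sum by $(2s-1)/s < 2$, which rounds down to 1. The helper names below are hypothetical.

```python
from fractions import Fraction
from math import comb, floor

def rounded_bound(s):
    """Bound on the sum over a set of size 2s-1 after adding the inequality
    'sum over T <= 1' for every s-subset T, dividing, and rounding down."""
    num_subsets = comb(2 * s - 1, s)        # number of inequalities added
    coefficient = comb(2 * s - 2, s - 1)    # times each variable appears
    return floor(Fraction(num_subsets, coefficient))

def rounds_to_cover(n_vars):
    """Rounds of the doubling step needed to reach sets of size >= n_vars,
    starting from the pairwise inequalities (sets of size 2)."""
    size, rounds = 2, 0
    while size < n_vars:
        assert rounded_bound(size) == 1     # each round keeps the bound at 1
        size, rounds = 2 * size - 1, rounds + 1
    return rounds

# Example: 2 * ell^(k-1) variables with ell = 2, k = 3.
rounds = rounds_to_cover(2 * 2 ** (3 - 1))
```

After $r$ rounds the covered set size is $2^r + 1$, so the number of rounds is logarithmic in the number of variables, which for $O(\ell^{k-1})$ variables is $O(k \log \ell)$.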

Lemma 4.5 follows from Claims 4.6 and 4.7. □

Putting Lemmas 4.2 and 4.5 together we obtain the following separations of our proof system
hierarchy.

Theorem 4.8. For any $\varepsilon > 0$ there is a family of unsatisfiable CNF formulas $G$ on $n$ variables
that requires nearly polynomial refutation rank $n^{\Omega(1/\log\log n)}$ and nearly exponential tree-like size
$\exp(n^{\Omega(1/\log\log n)})$ in all $T^{cc}(k)$ systems but has logarithmic refutation rank and polynomial tree-like
refutation size in CP(k) systems for any $k < (1 - \varepsilon) \log\log n$.

Proof. The bounds for rank and the lower bounds for tree-like size follow immediately from
Lemmas 4.2 and 4.5. The upper bound for tree-like size follows by expanding the logarithmic rank
CP(k) proof into a tree. □
5 Integrality Gaps



In this section we show how to use this approach to obtain not just rank lower bounds for unsatisfiable
formulas, but integrality gaps for optimization problems as well. We will present an integrality gap
for MAX-SAT as a canonical example.

The MAX-SAT problem is well-studied in the theory of approximation algorithms and optimal
inapproximability results are known under the assumption that P ≠ NP. There are also
unconditional inapproximability results known for a restricted class of algorithms that involve applying
Cutting Planes or LS+ procedures to a relaxation of the standard integer program (e.g. [7, 40]).

Given a CNF formula $G = \{C_1 \wedge \cdots \wedge C_m\}$ over variables $x_1, \ldots, x_n$, we can add a new set of
variables $z_1, \ldots, z_m$, and define $C'_i = \neg z_i \vee C_i$. Let $G'$ be $C'_1 \wedge \cdots \wedge C'_m$. If we convert these clauses
into linear constraints and add Boolean constraints, we obtain a linear program $L_G$ with objective
function $\sum_i z_i$ that is a natural LP relaxation of the MAX-SAT problem for $G$.
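Written out, the relaxation takes roughly the following form. This is a sketch under two assumptions made here for illustration: $P_i$ and $N_i$ denote the index sets of the positive and negative literals of $C_i$, and the Boolean constraints are read as the bounds $0 \le \cdot \le 1$. The clause constraint comes from translating $\neg z_i \vee C_i$ as $(1 - z_i) + \sum_{j \in P_i} x_j + \sum_{j \in N_i} (1 - x_j) \ge 1$.

```latex
\begin{align*}
\text{maximize}\quad & \sum_{i=1}^{m} z_i \\
\text{subject to}\quad
  & \sum_{j \in P_i} x_j + \sum_{j \in N_i} (1 - x_j) \;\ge\; z_i
  && \text{for } i = 1, \ldots, m,\\
  & 0 \le x_j \le 1, \quad 0 \le z_i \le 1
  && \text{for all } i, j.
\end{align*}
```

On 0/1 points, $z_i$ can be set to 1 exactly when clause $C_i$ is satisfied, so the integral optimum counts the maximum number of satisfiable clauses.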

There are $2^t \binom{n}{t}$ clauses over $n$ variables that contain exactly $t$ different variables. Let $\mathcal{N}^{t,n}_{m}$ be
the probability distribution induced by choosing $m$ of these clauses uniformly and independently.

Let $F$ be a $t$-CNF formula. We consider $G = \mathrm{Lift}_{1,2}(F)$ as described earlier, except that since
we have set $k = 1$ and $\ell = 2$, the form of $G$ can be considerably simplified. That is, the variables
of $G$ will consist of two bit-vectors, $x$ and $y$, in which $x$ will contain $n$ blocks, each of size 2, and $y$ will
be a vector of length $n$, where $y_i$ indicates which of the two elements of block $i$ will be chosen in $x$.
Each clause of $F$ is transformed into $2^t$ clauses in $G$, corresponding to each of the $2^t$ possible bits
of $x$ that could be chosen by $y$. Thus if $F$ has $m$ clauses, each of size $t$, $G$ has $2^t m$ clauses, each of
size $2t$, and $F$ is unsatisfiable if and only if $G$ is unsatisfiable. (There is no need for the clauses on
the $y_i$-variables that were used in the case of larger $\ell$.)
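The simplified construction can be sketched in a few lines of Python. The encoding is an assumption of this illustration, not the paper's notation: input clauses are lists of nonzero signed integers (DIMACS style), lifted variables are tuples `('y', i)` and `('x', i, b)`, and a lifted literal is a pair `(variable, polarity)`. The brute-force `satisfiable` helper is included only to check small examples.

```python
from itertools import product

def lift_1_2(clauses):
    """Lift a CNF: each clause of size t becomes 2^t clauses of size 2t."""
    lifted = []
    for clause in clauses:
        for bits in product((0, 1), repeat=len(clause)):
            new_clause = []
            for lit, b in zip(clause, bits):
                i = abs(lit)
                # Literal "y_i != b": satisfied unless y_i selects position b.
                new_clause.append((('y', i), b == 0))
                # Original literal, applied to the selected bit x_{i,b}.
                new_clause.append((('x', i, b), lit > 0))
            lifted.append(new_clause)
    return lifted

def satisfiable(clauses):
    """Brute-force satisfiability test, suitable for tiny examples only."""
    variables = sorted({var for clause in clauses for var, _ in clause})
    for values in product((False, True), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[var] == pol for var, pol in clause)
               for clause in clauses):
            return True
    return False

# All four 2-clauses over e_1, e_2 form an unsatisfiable formula.
F_unsat = [[1, 2], [-1, 2], [1, -2], [-1, -2]]
G = lift_1_2(F_unsat)
```

Here `G` has $2^2 \cdot 4 = 16$ clauses of size 4, and dropping any one clause of `F_unsat` makes both the original and the lifted formula satisfiable.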

The following theorem, which is key to getting an integrality gap, is a quantitatively stronger
version of Theorem 3.2 for the case $k = 1$ and $\ell = 2$.

Theorem 5.1. Let $F$ be any $t$-CNF formula on $n$ variables having resolution rank at least $r$, and
let $G = \mathrm{Lift}_{1,2}(F)$. Then any $R^{cc}(2, C)$ refutation of $G$ of rank $R$ must have $CR \log R \ge r^{\delta}$ for
some constant $\delta > 0$.

In order to prove Theorem 5.1 we will rely on the following stronger version of Theorem 3.2
for the special case of 2 players due to Sherstov [56]. The version we have already seen requires
that $\ell$ be large, which becomes a problem for obtaining integrality gaps. The theorem below has
much less dependence on the degree, and as a result it does not require $\ell$ to be large. However, this
quantitatively stronger version is currently only known to hold for 2-player communication
complexity.

Theorem 5.2 ([56]). Let $f$ be a boolean function on $n$ variables with sign-degree at least $d$ (and
hence 5/6-degree at least $d$). Then any 2-party communication protocol $\mathcal{P}$ computing $g = f \circ \psi_{1,2}$
with error 1/3 must have $|\mathcal{P}| = \Omega(d)$.

Proof of Theorem 5.1. We proceed as in the proof of Theorem 3.6. Suppose that $P$ is an $R^{cc}(2, C)$
refutation of $G = \mathrm{Lift}_{1,2}(F)$ of rank $R$. By Lemma 2.8, there exists a 2-party protocol $\mathcal{P}$ consistently
computing $G_{\mathrm{search}}$ with error 1/3 such that $|\mathcal{P}|$ is $O(CR \log R)$. Now on the one hand, by Lemma 3.3,
there exists a consistent system $Z = \{f_S : S \subseteq \mathrm{clauses}(F)\}$ of functions for $F$ such that for every $S$,
there exists a 2-party protocol $\mathcal{P}_S$ computing $f_S \circ \psi_{1,2}$ with error 1/3 such that $|\mathcal{P}_S|$ is $O(CR \log R)$.
On the other hand, by Propositions 2.2 and 2.3 we have



D{fs) > 



riog2 iFii riog2 iFii ■ 

Finally by Proposition EH we have d = deg^,^{fs) > {D{fs)Y'V^ >

Now by the above Theorem 5.2 we must have $CR \log R$ that is $\Omega(d)$, which is at least $r^{\delta}$ for some
constant $\delta < 1$. □

We now see how the above theorem can be applied to derive an integrality gap for small rank
Th(1) or Cutting Planes proofs.

Corollary 5.3. Let $t \ge 3$ be an integer. There exists $\delta < 1$ such that for all $\varepsilon > 0$ there is a
$\Delta > 1$ such that for a randomly chosen $F$ from $\mathcal{N}^{t,n}_{\Delta n}$, the integrality gap of any $n^{\delta}$-round Cutting
Planes (or Th(1)) relaxation of $L_G$, the linear relaxation of the $2t$-CNF $G = \mathrm{Lift}_{1,2}(F)$, is at least
$1 - 1/2^{2t} + \varepsilon$ with high probability.

Proof. Given $\varepsilon$, fix $\Delta > 2^t \ln 2 / \varepsilon'^2$, where $(1 - 1/2^t + \varepsilon')(2^t/(2^t - 1) - \varepsilon) = 1$. A random assignment
satisfies each of $F$'s clauses with probability $1 - 1/2^t$, so the expected number of satisfied clauses of
$F$ is $(1 - 1/2^t)\Delta n$. For appropriate choice of $\Delta$, the probability that a random assignment satisfies
more than a $1 - 1/2^t + \varepsilon'$ fraction of clauses is less than $2^{-n}$ by Chernoff bounds. Thus with
high probability, no assignment satisfies more than a $1 - 1/2^t + \varepsilon'$ fraction of $F$'s clauses. By the
construction of $G$ from $F$, each clause of $F$ has precisely $2^t$ corresponding clauses in $G$. It follows
that with high probability, no assignment satisfies more than a $1 - 1/2^{2t} + \varepsilon$ fraction of $G$'s clauses.

On the other hand, since $t \ge 3$, any Resolution refutation of $F$ requires linear rank [13, 5].
Thus by Theorem 5.1, even after Th(1) inference of rank $n^{\delta}$ (and in particular $n^{\delta}$ rounds of Cutting
Planes), there is some non-integral assignment $\alpha$ to the $x_i$'s that satisfies all linear constraints
corresponding to the clauses of $G$. Extend this assignment by setting all the $z_i$'s to 1 and it follows
that all constraints of $L_G$ are also satisfied. Thus we have a solution satisfying all constraints that
survives even after $n^{\delta}$ rounds. □

Note that since a random assignment on average satisfies a $1 - 1/2^{2t}$ fraction of clauses of any
$2t$-CNF formula, this yields an optimal integrality gap for rank $n^{\delta}$ Th(1) inference for MAX-2t-SAT
for any $t \ge 3$. Such a result was previously known for the special case of Cutting Planes proofs [7]
but the proof relied on the specific form of inference rather than the general sound inference allowed
for Th(1) proofs.

Remark. We can obtain a similar integrality gap for any function where we can prove decision
tree lower bounds. That is, take any optimization problem that can be expressed naturally as a
$t$-CNF formula, and such that an integrality gap of $1 - \gamma$ can be proven for decision trees. (Such
a result is usually elementary to obtain.) Then by our lifting technique, we can show that any
small $R^{cc}(2, C)$ refutation (including Cutting Planes and Th(1) proofs) for the lifted version has an
integrality gap of $1 - \gamma/2^t$. Our approach only works at present for proof systems that correspond
to 2-player communication complexity. However, an extension of Theorem 5.2 to the multiparty
setting (as was done with the qualitatively weaker Theorem 3.2, which was originally proven for the
2-player case [15]) would immediately yield integrality gaps for stronger matrix cut systems, such
as LS+ and Lasserre.
6 Discussion



In this paper we showed how to take an arbitrary 3-CNF formula $F$, and convert it into another
CNF formula $G$ so that the resolution rank of $F$ becomes polynomial in the $T^{cc}(k+1)$ rank of
$G$, for $k \in O(\log\log n)$. As applications, we obtained polynomial rank lower bounds for many
commonly studied matrix cut proof systems, including Cutting Planes and the full complement
of Lovasz-Schrijver variants, as well as non-constant rank lower bounds for Sherali-Adams and
Lasserre proofs. We also used our approach to obtain new hierarchy theorems for the systems CP(k)
and LS+(k).

While we focused on semi-algebraic systems in this paper, we would like to point out that our 
theorems can also be used to obtain non-constant rank lower bounds for many commonly studied 
algebraic systems, including Hilbert's Nullstellensatz and the Polynomial Calculus. While stronger 
lower bounds for these latter systems were already known prior to our work, our method achieves 
these lower bounds for a large class of new CNF formulas, and the technique is simple and generic.
It should also be possible to obtain degree-based hierarchy theorems using our approach. 

There are several interesting open problems directly related to our work. First, our theorems
as stated work for $k$ up to $(1 - o(1)) \log\log n$. We conjecture that it should be possible to derive
hardness escalation results that work for $k$ up to $\Omega(\log n)$. A key problem with our approach is
the tensor selector method, which, when applied for larger $k$, introduces superpolynomially many
variables. A similar problem arose when proving lower bounds for set disjointness and related functions
in the NOF communication model. The initial results ([3H I11]) used the tensor selector and worked
for $k = (1 - o(1)) \log\log n$; subsequent papers introduced new selector methods in order to prove
lower bounds for $k = \Omega(\log n)$ ([14, 3]). On the other hand, proving hardness escalation results
for $k = \omega(\log n)$ appears to require very new ideas and would solve a major open problem in
communication complexity and circuit complexity.

Secondly, an important open problem is to strengthen our method to obtain not only tree-size
lower bounds, but general (dag-like) size lower bounds. We note that this has already happened
for $k = 2$, where initially Cutting Planes tree-size lower bounds were proven based on two-player
communication complexity lower bounds [24] and the results were later generalized to obtain
unrestricted Cutting Planes size lower bounds [6, 37]. Such a result, even for $k = 3$, would give
unrestricted size lower bounds for Lovasz-Schrijver proofs, thus solving an important open
problem.

Finally, there are many very interesting questions related to hardness escalation. What
relationships are there between the various forms of hardness amplification, hardness escalation,
and hardness condensing? Are there other examples of hardness escalation,
even under reasonable assumptions? In particular it would be very interesting to obtain a hardness
escalation result that lifts lower bounds for a circuit class where cryptography is not possible to
a circuit class where cryptography is possible (e.g., lifting from DNF lower bounds to $\mathrm{TC}^0$ lower
bounds), as such a result would cross the "natural proof" barrier.